Abstract Algebra 
A Computational Approach 




Charles C. Sims 


Rutgers University 


JOHN WILEY & SONS 


New York Chichester Brisbane Toronto Singapore 


Copyright © 1984, by John Wiley & Sons, Inc. 
All rights reserved. Published simultaneously in Canada. 


Reproduction or translation of any part of
this work beyond that permitted by Sections
107 and 108 of the 1976 United States Copyright
Act without the permission of the copyright
owner is unlawful. Requests for permission
or further information should be addressed to
the Permissions Department, John Wiley & Sons.


Library of Congress Cataloging in Publication Data 


Sims, Charles C. 
Abstract algebra. 


Includes index. 

1. Algebra, Abstract—Data processing. I. Title. 
QA162.S54 1984  512’.02’028542 83-6715
ISBN 0-471-09846-9 


Printed in the United States of America 


10 9 8 7 6 5 4 3 2 1


In memory of my father, 
Ernest M. Sims 
(1883-1973) 


PREFACE 


This book is intended as a text for a one-year introductory course in abstract 
algebra in which algorithmic questions and computation are stressed. A sig- 
nificant amount of computer usage by students is anticipated. My decision 
to write the book grew out of my interest in group-theoretic algorithms and 
my observation that learning the definitions, the theorems, and even the 
proofs of algebra too often fails to equip students adequately to solve com- 
putational algebraic problems. The goals of the book are to: 


1 Introduce students to the basic concepts of algebra and to elementary 
results about them. 

2 Present the concept of an algorithm and discuss certain fundamental
algebraic algorithms.

3 Show how computers can be used to solve algebraic problems and
provide a library, CLASSLIB, of computer programs with which
students can investigate interesting computational questions in algebra.

4 Describe the APL computer language to the extent needed to achieve 
the other goals. 


To help meet these goals, two additional manuals have been prepared, an 
instructor’s manual and a CLASSLIB user’s manual. There is more ma- 
terial here than can be covered in a one-year course. The instructor’s manual 
contains suggested course outlines, hints on how to use this book, and the 
answers to selected exercises, including all that involve APL. The CLASSLIB 
user’s manual contains detailed information about the library with com- 
plete listings of all programs. Generally, students should not need to acquire 
the user’s manual. However, anyone wishing to make extensive use of 
CLASSLIB will probably want to have a copy. Both manuals are available 
from the publisher. The library can be obtained in machine-readable form 
from the author. 

Computation in algebra is not really new. In some areas, such as num- 
ber theory, there is a tradition of hand calculation going back hundreds of 
years. However, the development of the digital computer has inspired new 
interest in the subject. More and more research effort is being devoted to the
existence and efficiency of algebraic algorithms. In addition, the use of com- 
puters to solve problems in algebra is growing steadily. Many algebraic algo- 
rithms can be understood by students in an introductory course. 

The choice of the computer language to be used was very important. 
Of the languages normally available at college computer centers, only one is 
really suited for use in the teaching of algebra to students with little or no 
prior computing experience. That language is APL. The superiority of APL 
stems as much from the way the language is implemented as it does from the 
nature of the language itself. Here are some of the features of APL that 
make it the natural choice for this book. 


1 APL is implemented in an interactive mode. 

2 Arrays exist independent of programs. 

3 One-line statements can be entered and executed immediately. In ef- 
fect, beginning students do not have to write programs in the traditional 
sense. 

4 The language contains many powerful primitive operations for manip- 
ulating arrays that are very useful in describing algebraic algorithms. 


Even with the power of the APL language, most of the algorithms 
discussed are too complicated to be coded efficiently by beginning
programmers. Therefore I decided to provide a library of programs that would
allow students to use the algorithms on nontrivial problems while developing 
their skill in using APL. Students should not need to acquire a separate APL 
text; the appendices provide an adequate introduction to the language. 

At this point, it would be good to mention several things that the book 
is not. It is not a text in applied algebra, which emphasizes the use of alge- 
braic techniques to solve problems that arise outside of mathematics. Neither 
is this a book on numerical linear algebra, which deals with the numerical 
analysis aspects of linear algebra over the real and complex fields. Although 
the difficulties of performing computer calculations with real and complex 
numbers are discussed, the emphasis is on exact computation. For this 
reason, many of the computational exercises in linear algebra involve the 
fields GF(p), p a prime.

There are two reasons for recommending that this book be used for a 
one-year course. First, the time required to introduce students to APL is 
too great to leave sufficient time to cover a reasonable amount of algebra 
in a one-semester course. In order to be able to understand and reproduce 
the dialogues in the text, one needs to know the material in Sections 1 to 3
and 5 to 7 of Appendix 1 and Sections 2 and 3 of Appendix 2, as well as
certain topics discussed in Sections A1.4 and A2.1. Approximately three
weeks are necessary to cover this material, and even more time must be
spent if significant original programming is to be required of students. The 
second reason for suggesting a one-year course is that the most interesting 
algorithms, at least to me, come in the second half of the course. All of the 
chapters contain computational topics, but it was my desire to describe the 
material in Sections 6.5, 7.4, and 8.2, which provided the main motivation 
for this book. There are several additional topics that I would have liked to 
include. Some, such as factorization in Z[X] and a study of the ideals in
Z[X], were omitted for lack of space. Others, such as some of the recent 
developments in computational group theory, could not be included because 
they involve concepts that do not fit easily into an introductory algebra 
course. Galois theory has been left out because practical algorithms for com- 
puting Galois groups are too involved to be presented at this level. 

In the text, the lemmas, theorems, and corollaries are numbered con- 
secutively within a section. Theorem 3 of Section 4 of Chapter 6 is re- 
ferred to as Theorem 4.3 in other sections of Chapter 6 and as Theorem 
6.4.3 outside of Chapter 6. A similar numbering system is used for ex- 
amples and for exercises. Names occurring in brackets are references to 
the bibliography. In the contents, sections marked with an asterisk may 
be omitted without affecting the logical development. Exercises of greater 
than average difficulty are also flagged with asterisks. The ends of proofs 
are marked with the symbol □. A □ at the end of a theorem indicates that
the proof of that theorem will be omitted. 

Much of the writing of this book was done in college libraries. I am 
indebted to the library staffs at Rutgers University, Princeton University, 
Monmouth College, Southern Methodist University, and the Australian 
National University for the facilities they made available. Many individ- 
uals provided assistance throughout the eight years during which this book 
took shape. I wish to thank James England, Eugene Klotz, and Don Orth 
for many useful conversations concerning the use of APL in exposition.
Michael O’Nan and Hale Trotter provided assistance on various mathe- 
matical topics. In particular, some of the material in Section 7.4 is based 
on a talk by Trotter. Certainly thanks are due to Kenneth Iverson. With- 
out his development of APL, this book, at least in its present form, would 
not have been possible. Finally, I wish to record my deep gratitude to my 
wife, Annette, who typed an early version of this text and then typed the 
entire manuscript into a homegrown word processor. Her assistance made 
the preparation of the book much easier than it would otherwise have been. 


Charles C. Sims 




CONTENTS


SIGNIFICANT DEPENDENCIES BETWEEN SECTIONS

INTRODUCTION


1 SETS

1. Sets
2. Relations
3. Functions
*4. Sets of sets using APL
*5. Block designs and graphs


2 THE INTEGERS

1. Divisibility
2. Greatest common divisors
3. Congruence
4. Primes
*5. Multiple-precision arithmetic


3 GROUPS

1. Binary operations
2. Groups
3. Subgroups
4. Homomorphisms
5. Normal subgroups
6. Direct products
7. Permutations
8. Permutation groups
9. Graphs with a small number of vertices
10. Conjugacy
11. The Sylow Theorems


4 RINGS

1. Definition and examples
2. Subrings and homomorphisms
3. Computing in rings using APL
4. Polynomial rings
5. Matrix rings
6. Determinants
7. Units in matrix rings
8. Fields of fractions
9. Euclidean domains
10. Factorization
*11. Polynomial rings over UFD’s
*12. Interpolation


5 MODULES

1. Definitions
2. Free modules
3. Endomorphism rings
4. Algebras


6 MODULES OVER EUCLIDEAN DOMAINS

1. Row equivalence
2. Row equivalence, continued
3. Vector spaces
4. Solving linear systems using row operations
5. Finitely generated modules
6. Uniqueness of cyclic decompositions
*7. Solving linear systems using row and column operations


7 FIELDS

1. Extension fields
2. Splitting fields
*3. Finite fields
*4. Factorization in Z_p[X]


8 LINEAR TRANSFORMATIONS

1. Similarity
2. Rational canonical form
3. Eigenvalues and eigenvectors


APPENDIX 1 THE APL LANGUAGE

1. A sample terminal session
2. Arrays
3. Primitive scalar operations
4. Defined procedures
5. Primitive mixed operations
6. Reduction and scan
7. Inner and outer products
8. Some additional operations


APPENDIX 2 APL SYSTEMS

1. Editing
2. System variables
3. Workspaces and system commands
4. Error messages
5. Debugging
6. Programming efficiency


APPENDIX 3 THE SUPPLEMENTAL WORKSPACES

1. CLASSLIB
2. EXAMPLES


BIBLIOGRAPHY

INDEX


SIGNIFICANT DEPENDENCIES BETWEEN SECTIONS


[Diagram of section dependencies; the only surviving label is 3.7-3.8.]


INTRODUCTION 


This book is an introduction to the area of mathematics called abstract 
algebra or, simply, algebra. It differs from most books on the subject in that 
computation plays a central role throughout. A substantial portion of the 
text is devoted to algorithms for solving algebraic problems. Loosely de- 
fined, an algorithm is a sequence of instructions for solving a particular 
problem or class of problems. The instructions must be unambiguous, 
with no room for different interpretations by different individuals, and must 
lead to the solution of the problem in a finite number of steps. The em- 
phasis here will be on algorithms that can be carried out, or executed, by 
a computer. 


It is difficult to establish a specific date for the beginning of any branch 
of mathematics. Nevertheless, it is widely agreed that the work of the 
French mathematician Evariste Galois (1811-1832) set the stage for the 
development of algebra into one of the major areas of mathematical ac- 
tivity. It was not until later in the nineteenth century, however, that ab- 
straction became an important part of algebra. Abstraction is the process by 
which similarities are recognized between apparently dissimilar mathema- 
tical objects and by which these similarities are shown to be consequences 
of a few basic properties (axioms) that are possessed by all of the objects 
being studied. It is to this process that the word “abstract” in the phrase 
“abstract algebra” refers.


Introductory algebra courses are often referred to as courses on groups, 
rings, and fields. Algebra involves more than the study of groups, rings, 
and fields, but these three types of algebraic structures, together with 
one additional type, modules, form the subject matter of this text. The 
term “group” was coined by Galois, but the first formal definition was not
given until 1849 and the value of the concept of an “abstract group” was
not recognized for nearly 30 years more. The idea of a field is present in
Galois’ work, but the term was introduced by the German mathematician

Richard Dedekind (1831-1916) and the definition was not standardized 
until late in the nineteenth century. Although many examples of rings 
were known in the nineteenth century, the abstract theory was developed 
during the present century. The term “ring” was formulated by David 
Hilbert (1862-1943), a very important German mathematician. 


The formal prerequisites for the study of abstract algebra are minimal. 
However, it will be assumed that readers are familiar with certain concepts 
normally covered in lower-level undergraduate mathematics courses. These 
concepts include proofs by induction and the elementary properties of 
sets, the integers, and rational numbers. Some acquaintance with real and 
complex numbers will also be assumed. 


In this text, as in any text on abstract algebra, a considerable amount of
space is devoted to the formal development of the subject. As axioms are 
stated, definitions made, and theorems proved, readers are encouraged to 
study particular examples in detail. It is only through the study of examples 
that one can see how the abstract theory provides an efficient method of 
deriving useful information about many different mathematical objects. 
The investigation of examples can be facilitated with the help of a com- 
puter. The computer makes it possible to look at more complicated and, 
one hopes, more interesting examples by removing the drudgery of time- 
consuming hand calculation. 


In order to communicate with a computer we must use a computer language. 
The language chosen for use in this book is APL. The APL language is ex- 
tremely powerful, which means that complicated calculations can be de- 
scribed with a few symbols. APL also possesses a high degree of internal 
consistency and, in many ways, is more logical than traditional mathe- 
matical notation. No prior knowledge of APL will be assumed. The ap- 
pendices contain a description of the aspects of APL that are important 
for using this text. Appendix 1 describes the APL language in sufficient 
detail to permit readers to follow the computer examples in this book. 
However, it is very important that readers be able to work out these and 
other examples on the computer. Appendix 2 contains further information 
about APL systems that can help readers use their local systems efficiently. 


In order to make the best possible use of this book, readers should have 
access to two APL workspaces that have been specifically created to sup- 
plement the text. The workspace CLASSLIB contains procedures for carrying
out many types of algebraic computations. The workspace EXAMPLES
contains arrays that represent various kinds of algebraic objects. The ar- 
rays in EXAMPLES are used in the computational examples in the text. A 
more complete description of the contents of these two workspaces can 
be found in Appendix 3. It is suggested that the naming conventions
described in Section A3.1 be read before making extensive use of CLASSLIB.
All of the computer examples in the text assume that the contents of both 
CLASSLIB and EXAMPLES are present in the active workspace. 


The algorithms used in most of the procedures in CLASSLIB are discussed 
in the text. As each procedure is introduced, readers should concentrate on 
learning what it does and on understanding the basic algorithm involved. 
Once some familiarity with a procedure has been achieved, it is an extremely 
valuable exercise to write one’s own version of the procedure and compare 
it to the version in the library. I would appreciate being informed about 
possible improvements to the procedures in CLASSLIB.


Readers having some experience with APL may proceed immediately to 
Chapter 1. Those not familiar with APL should begin by reading through the 
first seven sections of Appendix 1. It is not necessary to become an expert 
in the use of APL before starting to learn algebra using this book. Once the 
fundamentals of the language have been grasped, the study of the real sub- 
ject matter—abstract algebra—should be begun. The appendices can then be 
used for reference, as needed. 


We will be using two different systems of symbolic notation, APL and tra- 
ditional, and it is important to be able to recognize which system is being 
used in a particular expression. APL expressions are printed in a special 
type font used only for APL. Thus C←A ZGCD B is an APL expression.
When any other type font is used, as in the statement “let c = gcd(a, b)”,
traditional mathematical notation is assumed. Some care is required to 
distinguish between the two commas that occur in this book. The ordinary 
comma (,) is a mark of punctuation, but the APL comma (,) represents one 
of two APL operations that are described in Section A1.5. Occasionally, we 
will borrow certain aspects of APL notation for use with traditional nota- 
tion. For example, we will sometimes denote the entry in the ith row and 
jth column of the matrix A by A[i;j], even though A is not an APL array. 
These borrowings should not present any serious problems. 


1 SETS


For nearly 100 years the formal exposition of mathematics has been based 
on the concept of a set. Readers are no doubt familiar with sets from previous 
courses in mathematics. In this chapter we will summarize the basic defini- 
tions, notation, and operations of set theory. We will also discuss ways of 
representing sets by APL arrays and techniques for manipulating these 
arrays to perform set-theoretic operations. The APL index origin, described 
in Section A1.2, is normally assumed to be 1.


1. SETS 


One of the most important ideas in the development of mathematics is the 
use of the axiomatic method, in which all of the theorems in a particular 
branch of mathematics are obtained as logical consequences of a few axioms 
that state the basic properties that are assumed to hold for the objects under 
study. This approach is probably most familiar in the area of plane geom- 
etry. In an axiomatic treatment of plane geometry, no attempt is made to 
say what points and lines really are. Instead, one writes down axioms such 
as “through any two distinct points there passes exactly one line” in an
attempt to formalize our intuitive notions about the kinds of pictures we 
can draw using a straightedge and a very sharp pencil. The axiomatic method 
is universally agreed to be the proper approach to the study of abstract 
algebra. 

The idea of a set has been found to be of fundamental importance not 
only in algebra but in most of present-day mathematics. All of the alge- 
braic objects we will study will be sets. It would seem reasonable, there- 
fore, to begin our study of algebra with an axiomatic treatment of sets. 
This approach seems even more essential when we learn, as we will at the 
end of this section, that our intuition concerning sets can lead us to logical 
contradictions. However, we will follow the accepted practice in introduc- 
tory algebra texts and omit a formal treatment of sets. There are two reasons 
for this. First, the exposition of axiomatic set theory would delay too 
long our study of the main subject matter of this book: the basic properties
of algebraic systems such as groups, rings, and fields and the algorithms for
solving problems related to them. Second, our intuition leads us astray
only when we try to consider sets that are “too big”. The sets we will
encounter in our study of algebra will be “small enough” that our intuition
can be trusted not to get us into trouble.

We define a set to be any collection of objects called the elements or
members of the set. A commonly used synonym for “set” is “family”. The
word “group” should not be used as a synonym for “set” because a group
in mathematical terminology is a particular kind of algebraic object that
we will study in Chapter 3. If x is an element of the set X, we write x ∈ X
and say that x belongs to X or that x is in X. If x is not an element of X,
we write x ∉ X. Two sets are equal if they have the same elements. Thus
the statement X = Y is equivalent to the following pair of assertions:


1. If x ∈ X, then x ∈ Y.
2. If y ∈ Y, then y ∈ X.


To show that X and Y are not equal, we must exhibit an element of X 
that is not an element of Y or an element of Y that is not an element of X. 

There are two standard ways of describing a set. The first is to list 
the elements of the set separated by commas and enclosed in braces. Thus 


S = {1, 2, 3, 5, 8, 13}, U = {2, 3, 5, 7, 11},
A = {2, 4, 8, 16}, B = {16, 8, 4, 2},
C = {2, 4, 8, 16, 8, 4, 2}


are all sets whose elements are positive integers. Since neither the order in 
which the elements are listed nor the fact that some elements are repeated 
has any significance, the sets A, B, and C are all equal. 
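
The equality of A, B, and C can also be checked mechanically. The following is a minimal sketch in Python rather than the APL used in this book; the helper name same_set is hypothetical, and it simply tests the two inclusions that define equality of sets.

```python
def same_set(xs, ys):
    """Return True when the lists xs and ys describe the same set."""
    # Every element of xs is in ys, and every element of ys is in xs.
    return all(x in ys for x in xs) and all(y in xs for y in ys)

A = [2, 4, 8, 16]
B = [16, 8, 4, 2]
C = [2, 4, 8, 16, 8, 4, 2]
U = [2, 3, 5, 7, 11]

print(same_set(A, B), same_set(A, C), same_set(A, U))  # prints True True False
```

Order and repetition play no part in the test, which is why A, B, and C all agree.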

The easiest sets to represent in APL are finite sets of real numbers. 
Since the entries of an APL array are real numbers, we may simply define 
a vector whose components list the elements of the set. Thus, if 


      S←1 2 3 5 8 13          B←16 8 4 2
      U←2 3 5 7 11            C←2 4 8 16 8 4 2
      A←2 4 8 16


then the vectors S, U, A, B, C correspond naturally to the definition of 
the sets S, U, A, B, C. (To save space, APL dialogues such as the preceding 
one are printed in two columns. At a terminal, they would appear as one
long column.) 

The notation {a_1, ..., a_n} and the use of an APL vector to list the
elements of a set both have the drawback that the representation for a par-
ticular set is not unique. In general, there is no natural order on the
elements of a set but, if the elements of the set are real numbers, we do get
a unique representation if we assume the elements are listed in increasing 
order and without repetitions. The procedure SSORT in CLASSLIB pro- 
duces this standard list of the elements in the set described by an arbitrary 
vector. 


      B                       C
16 8 4 2                2 4 8 16 8 4 2
      SSORT B                 SSORT C
2 4 8 16                2 4 8 16
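
The standard list described above (increasing order, no repetitions) is easy to imitate outside APL. The sketch below is in Python, not APL, and the function name ssort is only a hypothetical stand-in; it makes no claim about how the CLASSLIB procedure SSORT is actually written.

```python
def ssort(v):
    """Hypothetical analogue of SSORT: increasing order, no repetitions."""
    # set() drops repeated entries; sorted() lists what is left in increasing order.
    return sorted(set(v))

print(ssort([16, 8, 4, 2]))           # prints [2, 4, 8, 16]
print(ssort([2, 4, 8, 16, 8, 4, 2]))  # prints [2, 4, 8, 16]
```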


If X is an APL vector, we will often speak of “the set X” instead of “the
set represented by X”. In particular, we will often refer to the set ⍳N, which
can, of course, mean either {1, 2, ..., N} or {0, 1, ..., N-1}, depending on
the index origin.
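
The dependence on the index origin can be made concrete with a small sketch. It is written in Python rather than APL, and the function iota with its origin parameter is a hypothetical illustration, not part of CLASSLIB.

```python
def iota(n, origin=1):
    """Return the n consecutive integers that start at the index origin."""
    return list(range(origin, origin + n))

print(iota(4))            # prints [1, 2, 3, 4]
print(iota(4, origin=0))  # prints [0, 1, 2, 3]
```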

Sets whose elements are not real numbers are more difficult to repre- 
sent in APL. We will have to represent sets of sets of real numbers, sets of 
polynomials, and many other types of sets. Techniques for doing this will 
be discussed as the need arises. 

The second way to describe a set is to specify a property that charac- 
terizes the elements of that set. The statement 


X = {x | P(x)}


is read “X is the set of all x such that the property P holds for x.” For
example, we may define two sets L and M as follows:


L = {x | x is a positive real number},
M = {t | t is an even integer}.


There are a few sets that will come up so frequently that it is con- 
venient to have special symbols for them. The set of integers will be de- 
noted by Z and the set of positive integers or natural numbers by N. The 
symbols Q, R, and C will stand for the set of rational numbers, the set of 
real numbers, and the set of complex numbers, respectively. 

Suppose we define a set E by 


E = {x | x ∈ R, x^2 = -1}.


Since every real number has a nonnegative square, E has no elements. Such 
a set is said to be empty. The following theorem shows, among other things, 
that the set of all unicorns is equal to the set of all letters in the English 
alphabet that come after the letter Z. 


THEOREM 1. Any two empty sets are equal. 


Proof. Let X and Y be empty sets. Since X is empty, we cannot find 
any element x of X, and so we certainly cannot produce an x in X that is 
not in Y. Similarly, we cannot find an element of Y that is not in X because
Y has no elements. Thus we are forced to conclude that the statement
X ≠ Y is false, so X and Y must be equal. □


By Theorem 1 we may speak of the empty set, since there is only 
one. It will be denoted by the symbol ∅. Whenever we define a property
that a particular set may or may not have, it is a useful exercise to de- 
termine whether the empty set has the property. 

A set A is a subset of a set B if every element of A is also an element 
of B. In this case, we also say A is contained in B or that B contains A. 
A subset A of B is called a proper subset if A ≠ B, that is, if there is some
element of B that is not in A. We will write A ⊆ B when A is a subset of
B and A ⊂ B when A is a proper subset of B. Some authors prefer to write
A ⊂ B where we write A ⊆ B. The notation used here has been chosen
because it parallels the use of ≤ and < to denote inequality of real numbers.
The statements B ⊇ A and B ⊃ A mean A ⊆ B and A ⊂ B, respectively.
We have ∅ ⊆ A and A ⊆ A for any set A.

If the APL vectors A and B list the elements of two subsets A and
B of R, then the assertion A ⊆ B corresponds to the APL proposition ∧/A∊B.
(An APL proposition is an APL expression with one entry, which is either
1 or 0. The APL membership operation ∊ is described in Section A1.5
and the operation ∧/, called “and reduction”, is discussed in Section A1.6.)
For example,


      A←1 3 5                 ∧/A∊B
      B←1 2 3 4 5       1
      C←2 3 5 7               ∧/A∊C
                        0


Here A is a subset of B but not of C.
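
The same subset test can be sketched in Python (not APL) for readers who want to compare notations. The helper name is_subset is hypothetical; it checks membership entry by entry and then combines the results with "and", which is what the membership operation followed by and reduction does.

```python
def is_subset(a, b):
    """True when every entry of the list a also occurs in the list b."""
    return all(x in b for x in a)

A = [1, 3, 5]
B = [1, 2, 3, 4, 5]
C = [2, 3, 5, 7]

print(int(is_subset(A, B)))  # prints 1
print(int(is_subset(A, C)))  # prints 0
```

The conversion with int is only there to mimic the 1 or 0 of an APL proposition.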

For the time being, we will rely on our intuition concerning the term 
“finite set”, which will be defined in Section 3 of this chapter. If X is a
finite set, then |X| is the cardinality of X, that is, the number of elements 
of X. 

Suppose A_1, ..., A_k are sets of real numbers such that |A_i| = m for
1 ≤ i ≤ k. We can represent {A_1, ..., A_k} by a k-by-m matrix A such that
the Ith row A[I;] of A lists the elements of A_I. For example, CLASSLIB
contains a procedure SSUB such that A←K SSUB N defines A to be a
matrix whose rows list the K-element subsets of ⍳N.


      ⎕IO←1                   ⎕IO←0
      2 SSUB 4                4 SSUB 5
1 2                     0 1 2 3
1 3                     0 1 2 4
2 3                     0 1 3 4
1 4                     0 2 3 4
2 4                     1 2 3 4
3 4                           ⎕IO←1


The number of rows of K SSUB N is the binomial coefficient K!N.
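
The row count can be checked with a short sketch in Python rather than APL. The function k_subsets and its origin parameter are hypothetical, and the rows come out in lexicographic order, which need not match the order in which SSUB lists them.

```python
from itertools import combinations
from math import comb

def k_subsets(k, n, origin=1):
    """Rows listing the k-element subsets of {origin, ..., origin + n - 1}."""
    return [list(c) for c in combinations(range(origin, origin + n), k)]

rows = k_subsets(2, 4)
print(rows)
print(len(rows) == comb(4, 2))  # prints True
```

Here comb(4, 2) plays the role of the APL expression 2!4.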

Given one or more sets, we can construct new sets in several ways. 
For example, if X is a set, we can form P = {A | A ⊆ X}, the set of all
subsets of X. When X is finite, |P| = 2^|X|. Because of this, we will use 2^X to
denote the set of all subsets of X for any set X, finite or infinite.
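
For a small finite set the count 2^|X| can be confirmed by listing the subsets. The following Python sketch (not APL) uses a hypothetical helper power_set that collects the subsets of each size in turn.

```python
from itertools import combinations

def power_set(xs):
    """List every subset of the finite set described by xs."""
    elems = sorted(set(xs))
    return [set(c) for k in range(len(elems) + 1)
            for c in combinations(elems, k)]

P = power_set([1, 2, 3])
print(len(P) == 2 ** 3)  # prints True
```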

If X and Y are sets, then X ∪ Y, the union of X and Y, is the set of
elements that are members of at least one of the sets X and Y. Thus


X ∪ Y = {x | x ∈ X or x ∈ Y},


where the “or” is inclusive “or”.
The intersection X ∩ Y of X and Y is the set of all elements that are in
both X and Y. Hence


X ∩ Y = {x | x ∈ X and x ∈ Y}.


Two sets are disjoint if their intersection is empty. The set {x | P(x)} ∩ A
is often written {x ∈ A | P(x)}.
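
The notation {x ∈ A | P(x)} has a direct counterpart in languages with set comprehensions. The Python sketch below is only an illustration; the set A and the property P (evenness) are assumptions chosen for the example.

```python
A = {1, 2, 3, 4, 5, 6, 7, 8}

def P(x):
    """Illustrative property: x is even."""
    return x % 2 == 0

# The elements of A for which the property P holds.
evens_in_A = {x for x in A if P(x)}
print(sorted(evens_in_A))  # prints [2, 4, 6, 8]
```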

The difference X - Y is the set of elements of X that are not
elements of Y. If Y is a subset of X, then X - Y is also called the complement
of Y in X.

Suppose the vectors X and Y list the elements of two subsets X and
Y of R. Then it is quite easy to construct vectors that list the elements of
the sets X ∪ Y, X ∩ Y, and X - Y. The vector X,Y lists the elements
of X ∪ Y. The Ith component of the logical vector X∊Y is 1 if and only
if X[I] is equal to some component of Y. Thus the compression (X∊Y)/X
gives a list of the elements of X ∩ Y. The 1’s in ~X∊Y correspond to the
components of X that are not equal to any component of Y, and so


      (~X∊Y)/X


lists the elements of X - Y.


      X←1 2 3 4               (X∊Y)/X
      Y←2 4 5           2 4
      X,Y                     (~X∊Y)/X
1 2 3 4 2 4 5           1 3
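
For comparison, the three constructions can be sketched with Python lists instead of APL vectors. The function names union, intersection, and difference are hypothetical; each mirrors one of the APL expressions above, including the fact that the list for the union may contain repetitions.

```python
def union(x, y):
    """Like X,Y: concatenate the two lists; repetitions are harmless."""
    return x + y

def intersection(x, y):
    """Like (X∊Y)/X: keep the components of x that occur in y."""
    return [a for a in x if a in y]

def difference(x, y):
    """Like (~X∊Y)/X: keep the components of x that do not occur in y."""
    return [a for a in x if a not in y]

X = [1, 2, 3, 4]
Y = [2, 4, 5]

print(union(X, Y))         # prints [1, 2, 3, 4, 2, 4, 5]
print(intersection(X, Y))  # prints [2, 4]
print(difference(X, Y))    # prints [1, 3]
```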


The operations ∪, ∩, and - on sets satisfy many useful properties. The
most important are listed in the following theorem:


THEOREM 2. Let A, B, and C be sets. Then

(a) A = A ∪ A = A ∩ A = A ∪ ∅.
(b) A ∩ ∅ = ∅.
(c) A ∪ B = B ∪ A and A ∩ B = B ∩ A.
(d) The following are equivalent:
    (1) A ⊆ B.
    (2) A ∪ B = B.
    (3) A ∩ B = A.
    (4) A - B = ∅.
(e) (A ∪ B) ∪ C = A ∪ (B ∪ C).
(f) (A ∩ B) ∩ C = A ∩ (B ∩ C).
(g) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
(h) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
(i) (C - A) ∪ (C - B) = C - (A ∩ B).
(j) (C - A) ∩ (C - B) = C - (A ∪ B).
(k) If A ⊆ B and B ⊆ C, then A ⊆ C.


Proof. We will leave most of the proof as an exercise, proving only
part (i) as an illustration. First, let x be an element of (C - A) ∪ (C - B).
Then x is in C - A or in C - B. If x is in C - A, then x ∈ C and x ∉ A ∩ B
and so x is in C - (A ∩ B). Similarly, if x is in C - B, then x is in C - (A ∩ B).
Thus, in either case, x is in C - (A ∩ B), and so


(C - A) ∪ (C - B) ⊆ C - (A ∩ B).


Now suppose x is in C - (A ∩ B). Then x is in C and x is not in A ∩ B. Thus
x is either not in A or not in B. But this says x is in either C - A or C - B,
and so x is in (C - A) ∪ (C - B). Hence


(C - A) ∪ (C - B) ⊇ C - (A ∩ B).


Combining this with the previous result, we conclude that


(C - A) ∪ (C - B) = C - (A ∩ B). □
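
Identities such as parts (g) to (j) of Theorem 2 can be spot-checked by brute force on a small universe. The Python sketch below is an illustration only, with hypothetical helper names; a check over finitely many sets is evidence, not a proof.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    elems = sorted(universe)
    return [frozenset(c) for k in range(len(elems) + 1)
            for c in combinations(elems, k)]

def check_theorem_2(universe):
    """True if parts (g)-(j) hold for all A, B, C among the subsets of universe."""
    S = subsets(universe)
    for A in S:
        for B in S:
            for C in S:
                if A & (B | C) != (A & B) | (A & C):   # part (g)
                    return False
                if A | (B & C) != (A | B) & (A | C):   # part (h)
                    return False
                if (C - A) | (C - B) != C - (A & B):   # part (i)
                    return False
                if (C - A) & (C - B) != C - (A | B):   # part (j)
                    return False
    return True

print(check_theorem_2({1, 2, 3}))  # prints True
```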

Let X and Y be sets. The Cartesian product X × Y of X and Y is the
set of all ordered pairs (x, y) with x ∈ X and y ∈ Y. [The term “Cartesian”
is derived from the name of the French mathematician and philosopher René
Descartes (1596-1650).] The diagonal of X × X is the set {(x, x) | x ∈ X}.
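
A Cartesian product of two small finite sets can be listed directly. This Python sketch uses the standard library function itertools.product; the particular sets X and Y are assumptions chosen for the example.

```python
from itertools import product

X = [1, 2, 3]
Y = ["a", "b"]

pairs = list(product(X, Y))      # all ordered pairs (x, y) with x in X, y in Y
diagonal = [(x, x) for x in X]   # the diagonal of X × X

print(pairs)
print(diagonal)  # prints [(1, 1), (2, 2), (3, 3)]
```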

A partition of a set X is a set Π of subsets of X or, equivalently, a
subset of 2^X, such that


1. If A ∈ Π, then A ≠ ∅.
2. If A ∈ Π and B ∈ Π, then either A = B or A ∩ B = ∅.
3. Every element of X is in some element of Π.


Rephrasing, we can say that a partition of X is a family of nonempty subsets 
of X such that each element of X belongs to exactly one member of the 
family. The elements of the partition are called blocks. For example, 
{{1}, {2, 3}, {4, 5, 6}, {7, 8, 9, 10}} 

is a partition of {1, 2, ..., 10} with four blocks. For any proper nonempty 
subset A of a set X, the set {A, X − A} is a partition of X. A subset R of 
X is called a set of representatives for a partition Π of X if R contains 
exactly one element from each block of Π. 
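The three conditions defining a partition, and the definition of a set of representatives, translate directly into a test. The sketch below is in Python rather than APL, and the function names is_partition and is_set_of_representatives are hypothetical; it assumes the blocks are supplied as a list of finite sets.

```python
def is_partition(blocks, x):
    """Check conditions 1-3 for a family of blocks of the set x."""
    blocks = [set(b) for b in blocks]
    # Condition 1: no block is empty.
    if any(len(b) == 0 for b in blocks):
        return False
    # Condition 2: two blocks are either equal or disjoint.
    for i, a in enumerate(blocks):
        for b in blocks[i + 1:]:
            if a != b and a & b:
                return False
    # Condition 3, together with the requirement that blocks lie in x.
    return set().union(*blocks) == set(x)

def is_set_of_representatives(r, blocks):
    """True if r lies in the union of the blocks and meets each block once."""
    blocks = [set(b) for b in blocks]
    r = set(r)
    return r <= set().union(*blocks) and all(len(r & b) == 1 for b in blocks)

PI = [{1}, {2, 3}, {4, 5, 6}, {7, 8, 9, 10}]
print(is_partition(PI, range(1, 11)))
print(is_set_of_representatives({1, 2, 4, 7}, PI))
```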

We close this section with an illustration of the ways our intuition 
can get us into trouble when we try to use sets that are "too big". It is 
possible to imagine sets that are members of themselves, such as the set of 
all sets or the set of all abstract concepts. Thus we can reasonably form 

K = {X | X is a set and X ∉ X}, 

the set of all sets that are not members of themselves. Now we ask the 
question whether or not K is a member of itself. If K ∉ K, then K is one of 
the sets that our definition says must be in K, and so K ∈ K. But if K ∈ K, 
then K cannot be one of the members of K, and so K ∉ K. Thus it seems that 
the statements K ∉ K and K ∈ K must both be false, since they each lead to 
a contradiction. This antinomy or paradox is due to the English 
mathematician Bertrand Russell (1872-1970). It and others like it had a 
great influence on the development of axiomatic set theory. 


EXERCISES 


1 What is a reasonable interpretation for {1, 2, ..., n} when n = 0? 

2 Let X be a set. Explain the difference between X and {X}. What 
are |∅| and |{∅}|? 

3 Let X and Y be finite sets. Show 
(a) |2^X| = 2^|X|. 
(b) |X ∪ Y| + |X ∩ Y| = |X| + |Y|. 
(c) |X × Y| = |X| × |Y|. 

4 What is 2^∅? 

5 Complete the proof of Theorem 2. 

6 Determine the number of partitions of a set X when 1 ≤ |X| ≤ 4. 

7 Suppose Π₁ and Π₂ are partitions of a set X. We say Π₁ is a 
refinement of Π₂ if, whenever A ∈ Π₁ and B ∈ Π₂, then either A ⊆ B or 
A ∩ B = ∅. Let X = {1, 2, ..., 10} and let Π be the partition 

{{1}, {2, 3}, {4, 5, 6}, {7, 8, 9, 10}} 

of X. How many partitions of X are refinements of Π? For how 
many partitions Π₁ of X is Π a refinement of Π₁? 

8 How many partitions of the empty set are there? 

9 We stated that all algebraic objects studied in this text would be 
sets, but we did not define the ordered pair (x, y) as a set. Show 
that if we define (x, y) to be {{x}, {x, y}}, then (x, y) = (u, v) 
if and only if x = u and y = v. 

10 Let X, Y, and Z be sets. Are X × (Y × Z) and (X × Y) × Z the 
same set? 

In Exercises 11 to 15 A, B, and C are APL vectors listing the elements of 
subsets A, B, and C of R, respectively. 

11 Write APL propositions corresponding to each of the following 
assertions. Check your answers at a terminal with specific examples. 
(a) B ⊆ C. 
(b) A ⊆ B. 
(c) A = C. 
(d) C = A ∪ B. 
(e) B = A ∩ C. 
(f) B ⊆ A − C. 
(g) A ≠ B. 
(h) 7 ∈ A. 
(i) B ⊆ Z. 
(j) {3, 5, 9} ⊆ C. 
(k) |A| = |C|. 
(l) |B| < 5. 

12 Write APL expressions defining vectors that list the elements of the 
following sets. Check your answers at a terminal. 
(a) A ∩ C. 
(b) B ∪ C. 
(c) C − (A ∩ B). 
(d) A ∪ (B ∩ C). 
(e) {a ∈ A | a ≥ 6}. 
(f) A ∩ Z. 

13 Why might it be a good practice to use SSORT A,B to represent 
A ∪ B instead of using A,B? 

14 Write APL procedures SETEQ, SETUN, SETINT, and SETDIFF 
such that A SETEQ B is 1 if A = B and 0 otherwise and A SETUN B, 
A SETINT B, and A SETDIFF B are vectors listing the elements 
of A ∪ B, A ∩ B, and A − B, respectively. 

15 Verify assertions (a) to (c) and (e) to (j) of Theorem 2 at a terminal 
for several choices of the sets A, B, and C. The procedures defined 
in Exercise 14 may be used to construct representations for the 
sets involved and to test equality. 

16 Let n be a positive integer. What would be a good way to represent 
a partition of {1, ..., n} by an APL array? Given representations 
of two partitions Π₁ and Π₂, how could you decide whether Π₁ = Π₂ 
or whether Π₁ is a refinement of Π₂? 


2. RELATIONS 


A relation from a set X to a set Y is a subset R of the Cartesian product 
X × Y. A relation on X is a relation from X to X. In this context it is 
conventional to replace the statement (x, y) ∈ R by xRy. For example, when 
we speak of the relation >, "is greater than", on R, we mean 

{(x, y) | x, y ∈ R and x > y}, 

and by the relation ⊆, "is contained in", on 2^X, we mean 

{(A, B) | A ⊆ B ⊆ X}. 


On any set X we have the identity relation, the diagonal of X × X, in which 
two elements of X are related if and only if they are the same. The relation 
X × X itself is the trivial relation on X and ∅ is the empty relation. 
Relations occur in everyday speech. Thus "is a child of", "has a common 
ancestor with", and "lives on the same street as" can be considered relations 
in our sense on the set of all people living today when they are interpreted 
as describing certain sets of ordered pairs of people. 

Let R and S be relations from X to Y. Then R ∪ S, R ∩ S, and 
(X × Y) − R are also relations from X to Y. If R ⊆ S, we say R implies S. 
Thus < implies ≤, and "is a son of" implies "is a child of". The set 

R⁻¹ = {(y, x) | (x, y) ∈ R} 

is a relation from Y to X and is called the inverse of R. Clearly, 
(R⁻¹)⁻¹ = R and R implies S if and only if R⁻¹ implies S⁻¹. The inverse 
of > is <, and the inverse of "is a child of" is "is a parent of". 

If X is the set ⍳M and Y is the set ⍳N, then a relation R from X to Y can 
be conveniently represented by the M-by-N matrix R such that R[I;J] is 1 if 
the pair (I, J) is in R and 0 otherwise. Such matrices can often be 
constructed using APL outer products. Recall that if f is any APL dyadic 
scalar operation and if U and V are vectors, then U∘.fV is the matrix whose 
I,J-th entry is U[I] f V[J]. In the examples 

      ⎕←S←(⍳5)∘.<⍳5          ⎕←T←1≥|(⍳4)∘.-⍳6 
0 1 1 1 1                1 1 0 0 0 0 
0 0 1 1 1                1 1 1 0 0 0 
0 0 0 1 1                0 1 1 1 0 0 
0 0 0 0 1                0 0 1 1 1 0 
0 0 0 0 0 

S represents the relation < on ⍳5 and T represents "differs in absolute value 
by at most 1 from", considered as a relation from ⍳4 to ⍳6. 

If the APL matrix R is the logical matrix representing a relation R from 
⍳M to ⍳N, then the matrix R will be called the characteristic matrix for 
the relation R. The reason for this terminology will be discussed in 
Section 3. 
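For readers without access to an APL terminal, the outer product construction can be imitated with nested lists. This is a Python sketch, not the book's APL; the function name outer is our own, and the matrices S and T are intended to mirror the relation < on {1, ..., 5} and the relation "differs in absolute value by at most 1 from" between {1, ..., 4} and {1, ..., 6}.

```python
def outer(u, v, f):
    """Matrix whose (i, j) entry is 1 if f(u[i], v[j]) holds and 0 otherwise."""
    return [[int(f(a, b)) for b in v] for a in u]

# Characteristic matrix of < on {1, ..., 5}.
S = outer(range(1, 6), range(1, 6), lambda a, b: a < b)

# Characteristic matrix of |a - b| <= 1 from {1, ..., 4} to {1, ..., 6}.
T = outer(range(1, 5), range(1, 7), lambda a, b: abs(a - b) <= 1)

for row in S:
    print(*row)
print()
for row in T:
    print(*row)
```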

If R is a relation from X to Y and S is a relation from Y to Z, then 
the composition of R and S is the relation from X to Z consisting of those 
pairs (x, z) for which there is an element y in Y such that (x, y) ∈ R and 
(y, z) ∈ S. For example, person x is a grandchild of person z if and only if 
there is a person y such that x is a child of y and y is a child of z. Thus 
the relation "is a grandchild of" is the composition of the relation "is a 
child of" with itself. The composition of the relations "is at least 2 more 
than" and "is at least 3 more than" on R is the relation "is at least 5 more 
than", since for real numbers x and z the assertion x ≥ z + 5 is equivalent 
to the assertion that there is a real number y such that x ≥ y + 2 and 
y ≥ z + 3. The composition of R and S is denoted R∘S. 

Suppose now that X is ⍳L, Y is ⍳M, and Z is ⍳N and that R is described 
by the matrix R and S by the matrix S. How can we construct the matrix 
T representing R∘S? The entry T[I;K] is 1 if and only if for some J in ⍳M 
the entries R[I;J] and S[J;K] are both 1. This is equivalent to saying 
that the row R[I;] and the column S[;K] have a 1 in the same position. 
Thus T[I;K] is 

      ∨/R[I;]∧S[;K]. 

But this means that T is the inner product R∨.∧S. (APL inner products 
are discussed in Section A1.7.) Consider the following example. 


      ⎕←R←(⍳6)∘.>⍳6          ⎕←T←R∨.∧R 
0 0 0 0 0 0              0 0 0 0 0 0 
1 0 0 0 0 0              0 0 0 0 0 0 
1 1 0 0 0 0              1 0 0 0 0 0 
1 1 1 0 0 0              1 1 0 0 0 0 
1 1 1 1 0 0              1 1 1 0 0 0 
1 1 1 1 1 0              1 1 1 1 0 0 


Here R represents > on ⍳6 and T represents the composition of > with 
itself, in other words, the relation "is at least two greater than". 
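The or-and inner product can be spelled out entry by entry. The sketch below is Python, not APL, and the name compose is hypothetical; it assumes both matrices are lists of 0/1 rows with compatible shapes, and it recomputes the composition of > with itself on {1, ..., 6}.

```python
def compose(r, s):
    """Boolean matrix of the composition: entry (i, k) is 1 exactly when
    some j has r[i][j] = 1 and s[j][k] = 1."""
    return [
        [int(any(r[i][j] and s[j][k] for j in range(len(s))))
         for k in range(len(s[0]))]
        for i in range(len(r))
    ]

# Characteristic matrix of > on {1, ..., 6}.
R = [[int(i > j) for j in range(1, 7)] for i in range(1, 7)]
T = compose(R, R)

for row in T:
    print(*row)
```

Each entry costs a scan over the middle index, so this direct method takes time proportional to the product of the three dimensions.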


THEOREM 1. Let R be a relation from X to Y, S a relation from 
Y to Z, and T a relation from Z to W. Then 

(a) (R∘S)∘T = R∘(S∘T). 
(b) (R∘S)⁻¹ = S⁻¹∘R⁻¹. 

Proof. We note first that R∘S is a relation from X to Z, and so 
(R∘S)∘T is defined and is a relation from X to W. Similarly, R∘(S∘T) is 
defined and is also a relation from X to W. Suppose now that (x, w) is in 
(R∘S)∘T. Thus there exists z in Z with (x, z) in R∘S and (z, w) in T. But 
therefore there exists y in Y with (x, y) in R and (y, z) in S. This implies 
that (y, w) is in S∘T, and so (x, w) is in R∘(S∘T). We have now shown 

(R∘S)∘T ⊆ R∘(S∘T). 

The proof of the reverse containment and the proof of part (b) are left as 
exercises. □ 


Part (a) of Theorem 1 states that composition of relations is associ- 
ative, and part (b) states that the inverse of the composition of two rela- 
tions is the composition of the inverses in the opposite order. 

There are three important properties that a relation R on a set X 
may have. We say R is reflexive if xRx for all x in X. This is the same as 
requiring that R contain the diagonal of X × X or that the identity 
relation imply R. The relation R is symmetric if whenever xRy, then yRx. 
Equivalently, R is symmetric if R implies R⁻¹. But if R ⊆ R⁻¹, then 

R⁻¹ ⊆ (R⁻¹)⁻¹ = R, 

and so R = R⁻¹. Thus R is symmetric if and only if R is equal to its 
inverse. We say R is transitive if whenever xRy and yRz, then xRz. Since 
there exists a y in X with xRy and yRz if and only if (x, z) is in R∘R, we 
see that R is transitive if and only if R∘R implies R. A relation that is 
reflexive, symmetric, and transitive is called an equivalence relation. We 
will see equivalence relations quite often in our study of algebra. 

On any set X the identity relation and the trivial relation are 
equivalence relations. On R the relation < is transitive but not symmetric 
or reflexive, while ≤ is reflexive and transitive but not symmetric. The 
relation "has a common parent with" is reflexive and symmetric but not 
transitive. 

Suppose R is an N-by-N matrix describing a relation R on ⍳N. The 
main diagonal of R is the vector 1 1⍉R. (The dyadic transpose operation 
⍉ is explained in Section A1.5. In origin 0 the diagonal of R is 0 0⍉R.) 
Since R is reflexive if and only if the diagonal entries of R are all 1, the 
proposition ∧/1 1⍉R corresponds to the assertion that R is reflexive. We 
leave it to the reader (see Exercise 18) to show that the propositions 
∧/,R=⍉R and ∧/,R≥R∨.∧R correspond to the assertions that R is symmetric 
and transitive, respectively. Let us look at a simple example. 

      ⎕←E←4 4⍴1 0 0 1 0 1 1 0 0 1 1 0 1 0 0 1 
1 0 0 1 
0 1 1 0 
0 1 1 0 
1 0 0 1 
      ∧/1 1⍉E 
1 
      ∧/,E=⍉E 
1 
      ∧/,E≥E∨.∧E 
1 

Here we see that the relation on ⍳4 represented by E is an equivalence 
relation. 
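The three tests can be restated without APL primitives. The following is a Python sketch with hypothetical function names; it checks each property directly from its definition rather than through the transpose and inner product, and it assumes the matrix is a square list of 0/1 rows.

```python
def is_reflexive(m):
    # Every diagonal entry is 1.
    return all(m[i][i] == 1 for i in range(len(m)))

def is_symmetric(m):
    # The matrix equals its transpose.
    n = len(m)
    return all(m[i][j] == m[j][i] for i in range(n) for j in range(n))

def is_transitive(m):
    # Whenever i is related to j and j to k, i is related to k.
    n = len(m)
    return all(
        m[i][k] == 1
        for i in range(n) for j in range(n) for k in range(n)
        if m[i][j] == 1 and m[j][k] == 1
    )

E = [[1, 0, 0, 1],
     [0, 1, 1, 0],
     [0, 1, 1, 0],
     [1, 0, 0, 1]]

# Characteristic matrix of < on a five-element set, for contrast.
LESS = [[int(i < j) for j in range(5)] for i in range(5)]

print(is_reflexive(E), is_symmetric(E), is_transitive(E))
print(is_reflexive(LESS), is_symmetric(LESS), is_transitive(LESS))
```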

The procedure SEQREL in CLASSLIB checks whether a logical matrix 
defines an equivalence relation. 


SEQREL E 
1 


The method used in SEQREL is more efficient than just checking in turn 
for reflexivity, symmetry, and transitivity. It is based on the ideas of 
Theorem 2, which follows. (See also Exercise 21.) 

We will now show that there is a close connection between equivalence 
relations on a set X and partitions of X. For any relation R from X to Y and 
any subset A of X, let 

AR = {y ∈ Y | aRy for some a ∈ A}. 

Thus AR is the set of all elements of Y to which R relates at least one 
element of A. For example, if R is the relation "is a child of" and A is a 
set of people, then AR is the set of all people y such that some person 
a in A is a child of y. In other words, AR is the set of parents of the 
people in A. 

THEOREM 2. Let E be an equivalence relation on X. Then 

Π = {{x}E | x ∈ X} 

is a partition of X. 


Proof. Since E is reflexive, for all x in X we know that x is in {x}E. 
Thus every element of X is contained in some member of Π and all the 
members of Π are nonempty. Suppose that an element x of X is contained 
in both {y}E and {z}E. We must show {y}E = {z}E. Suppose w is in 
{y}E. Then we have yEx and zEx and also yEw. By the symmetry of E 
this implies xEy. Using the transitivity of E twice, we get zEy and hence 
zEw. Therefore w is in {z}E. This shows that {y}E ⊆ {z}E. By a similar 
argument we obtain {z}E ⊆ {y}E and so {y}E = {z}E. Hence every 
element of X is contained in a unique member of Π. □ 

The partition Π of Theorem 2 is sometimes written X/E. Elements of 
Π are called equivalence classes of E. 
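When E is given by its characteristic matrix, the class {x}E can be read off from row x, so the partition X/E is the set of distinct rows. The following Python sketch (not the book's APL; the name equivalence_classes is our own) assumes the matrix really does describe an equivalence relation on {1, ..., N} and does not verify this.

```python
def equivalence_classes(e):
    """Distinct classes {x}E, in order of first appearance, for an
    equivalence relation on {1, ..., n} given as an n-by-n 0/1 matrix."""
    n = len(e)
    classes = []
    for x in range(n):
        # Row x lists the elements related to x; shift to 1-based labels.
        block = frozenset(y + 1 for y in range(n) if e[x][y])
        if block not in classes:
            classes.append(block)
    return classes

E = [[1, 0, 0, 1],
     [0, 1, 1, 0],
     [0, 1, 1, 0],
     [1, 0, 0, 1]]

classes = equivalence_classes(E)
representatives = [min(block) for block in classes]
print([sorted(block) for block in classes])
print(representatives)
```

Taking the least element of each class is one convenient choice of a set of representatives.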

It is also true that given a partition Π of a set X we can construct an 
equivalence relation on X. Define the relation E_Π on X so that x E_Π y if 
and only if x and y are in the same block of Π. 

THEOREM 3. Let Π be a partition of the set X and let R be an 
equivalence relation on X. Then 

(a) E_Π is an equivalence relation on X and Π = X/E_Π. 
(b) R = E_{X/R}. 

Proof. The proof is left as an exercise. □ 


EXERCISES 


1 Complete the proof of Theorem 1. 

2 Suppose |X| = 2. How many relations on X are there? How many 
of these are reflexive? How many are symmetric? How many are 
transitive? 

3 Find the error in the following "proof" that a symmetric, transitive 
relation R on a set X is automatically reflexive. For any x in X 
we can conclude by symmetry from xRy that yRx and then by 
transitivity that xRx. 

4 Is the empty relation on a set reflexive? Is it symmetric or 
transitive? 

5 Let R be a relation from X to Y and let A and B be subsets of 
X. Show 
(a) If A ⊆ B, then AR ⊆ BR. 
(b) (A ∪ B)R = (AR) ∪ (BR). 
(c) (A ∩ B)R ⊆ (AR) ∩ (BR). 
Give an example showing that equality does not always hold in 
part (c). 

6 Let R be the relation < on R and let S = R ∩ (Z × Z) be the 
corresponding relation on Z. Show that R∘R = R but S∘S ≠ S. 

7 Let R be a reflexive relation on a set X. Show that R ⊆ R∘R. 

8 Prove that if S and T are symmetric relations on a set X, then so 
are S ∪ T, S ∩ T, and S∘S. 

9 A relation on a set may or may not be reflexive, it may or may 
not be symmetric, and it may or may not be transitive. Show 
that all eight possibilities can occur by giving examples of everyday 
relations illustrating each type. 

10 Let E and F be equivalence relations on a set X. Show that E ∩ F 
is an equivalence relation but that, in general, E ∪ F is not. 

11 Is the converse of Theorem 2 true? That is, if E is a relation on 
X such that {{x}E | x ∈ X} is a partition of X, does it necessarily 
follow that E is an equivalence relation? 

12 Prove Theorem 3. 

13 Suppose X is the set ⍳M and Y is the set ⍳N. Given that R and S 
are M-by-N matrices describing relations R and S from X to Y, 
write APL expressions for the matrices representing the following 
relations. 
(a) R ∪ S. 
(b) R ∩ S. 
(c) (X × Y) − R. 
(d) R⁻¹. 

14 Let the relations R and S and the matrices R and S be as in 
Exercise 13. Write APL propositions equivalent to the following 
statements. 
(a) R implies S. 
(b) If R does not hold, then S holds. [By this we mean that if 
x ∈ X, y ∈ Y, and (x, y) ∉ R, then (x, y) ∈ S.] 
(c) R and S never hold simultaneously. [That is, there is no pair 
(x, y) with (x, y) ∈ R and (x, y) ∈ S.] 

15 Write APL expressions defining the matrices representing the 
following relations on ⍳10. 
(a) = 
(b) ≠ 
(c) < 
(d) > 
(e) "is at least 2 more than" 
(f) "equals half of" 

16 Explain how one could verify Theorem 1 in particular cases at 
a terminal. 

17 Show that the N-by-N logical matrix R defines a reflexive relation 
on ⍳N if and only if ∧/,R≥(⍳N)∘.=⍳N. 

18 Prove the validity of the APL formulations of symmetry and 
transitivity given in the text. 

19 Describe how the random number generator in an APL terminal 
system may be used to construct "random" logical matrices, that 
is, matrices R such that the entries R[I;J] are chosen independently 
from the set {0, 1} with each value having a probability of 
½ of being chosen. 

20 Using the technique of Exercise 19, construct 10 random, logical, 
10-by-10 matrices R. How many times is R∨.∧R equal to 10 10⍴1? 
Explain your answer. 

*21 Let R be an N-by-N logical matrix. To compute R∨.∧R in the natural 
way requires a time proportional to N*3. Show that it is possible 
to decide whether or not R defines an equivalence relation on 
⍳N in a time proportional to N*2. 

22 Let X and Y be finite subsets of R. How could we represent a 
relation from X to Y by one or more APL arrays? 

23 The workspace EXAMPLES contains a 25-by-25 logical matrix E25. 
If one entry of E25 is changed, then E25 becomes the characteristic 
matrix of an equivalence relation on ⍳25. Which entry should 
be changed? 

24 Let E be the characteristic matrix for an equivalence relation E on 
⍳N and let U be a vector with distinct components listing the 
elements of a subset U of ⍳N. Write an APL proposition corresponding 
to the assertion that U is a set of representatives for the set of 
equivalence classes of E. 


3. FUNCTIONS 


A function from a set X to a set Y is a relation f from X to Y such that for 
each x in X the set {x}f has exactly one element. The statement "f is a 
function from X to Y" is written symbolically f: X → Y or with the name f 
written above an arrow from X to Y. The set X is called the domain of f, 
and Y is called the codomain or range of f. If {x}f = {y}, then y is called 
the image of x under f, and we write y = xf or f: x ↦ y. The words "map" 
and "mapping" are synonyms for "function". Words such as "operator" and 
"transformation" are also used to refer to certain kinds of functions. The 
statement f: x ↦ y can be read "f maps x to y." The set of all functions 
from X to Y is denoted Y^X. Exercise 3 explains why this notation was 
chosen. 

Mathematicians have developed a great many different notations for 
representing the image of an element x under a function f. We have 
introduced the notation xf, but fx, f(x), f_x, and x^f are also widely used. 
Other notations are possible when the symbol representing the function 
is something other than a letter. For example, x* and x′ could be the images 
of x under two functions * and ′, respectively. Analysts seem to favor 
writing the name of the function on the left, as in the familiar calculus 
expression y = f(x). Algebraists, on the other hand, tend to prefer writing 
the name of the function on the right. Each convention has advantages as 
well as disadvantages. Faced with this abundance of notational 
possibilities, students of mathematics have little choice but to learn to 
be flexible and to adopt the conventions of the particular branch of 
mathematics they are studying. In this book we are torn between the 
preference of algebraists for writing the symbol for a function on the right 
and the rule in APL that requires the symbol for a monadic function to be 
placed to the left of its argument. The decision has been made to let these 
two notational conventions coexist, peacefully it is hoped. The wisdom of 
this decision will have to be judged by the reader. 
Let us consider now some examples of functions. 
Let us consider now some examples of functions. 


Example 1. On any set X the identity relation e is a function, the identity 
function. For any x in X we have xe = x. 


Example 2. If X and Y are sets, then the projection of X × Y onto X is the 
map taking (x, y) to x. The projection onto Y is similarly defined. 


Example 3. A vector of length N is a function whose domain is ⍳N. Whether 
we use traditional notation, as in v = (2, 4, 6), or APL notation, V←2 4 6, 
the ith component of a vector is the image of i under the vector. 


Example 4. The most general definition of a matrix is a function whose 
domain is a Cartesian product X × Y. Normally, we consider only the 
case in which X is ⍳M and Y is ⍳N for some positive integers M and N. If 
A is a matrix, then the image of the pair (i, j) under A is written A_ij or, 
in APL notation, A[i;j]. Some authors use the statement A = [a_ij] to 
assert that A is a matrix whose ijth entry is a_ij. We will not use 
statements of this type here. 


Example 5. For any set X we may define a function g mapping X to 2^X 
by xg = {x}. 


Example 6. Let Π be a partition of the set X. For each x in X, let π(x) 
be the block of Π containing x. Then π is called the natural map from 
X to Π. 


Example 7. A binary operation on a set X is a function from X × X to 
X. The ordinary arithmetic operations +, −, and × are binary operations 
on R. The operations ∪ and ∩ may be considered to be binary operations 
on 2^X for any set X. If f is a binary operation on X, then the image (x, y)f 
of the ordered pair (x, y) is often written xfy. For example, x + y and 
x × y are the images of (x, y) under the binary operations + and ×, 
respectively. In fact, when the operation is clearly understood, we may even 
write xy, as is often done in the case of multiplication. (Of course, the 
omission of the symbol for any function is strictly prohibited in APL.) 
It is convenient to use the symbol • to denote a typical binary operation 
on a set X, with x • y denoting the image of (x, y) under •. 


Using the extended definition of a matrix given in Example 4, we may 
say that a binary operation on X is a matrix whose rows and columns are 
indexed by X and whose entries lie in X. Thus a binary operation on ⍳N 
is simply an N-by-N matrix with entries in ⍳N. For example, if 

      ⎕←MAX←(⍳4)∘.⌈⍳4 
1 2 3 4 
2 2 3 4 
3 3 3 4 
4 4 4 4 

then MAX is the binary operation on ⍳4 defined by the maximum 
operation ⌈. 

There are two important properties a binary operation • on a set 
X may possess. If x and y are in X and x • y = y • x, we say x and y 
commute. If every pair of elements in X commutes, • is called a commutative 
binary operation. If x • (y • z) = (x • y) • z, then x, y, and z associate, 
and if every triple of elements in X associates, we say • is an associative 
operation. On R the operations + and × are commutative and associative, 
but − is neither. 
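For a binary operation on {1, ..., N} stored as an N-by-N table, both properties can be tested exhaustively. The sketch below is in Python rather than APL and its names are hypothetical. The table MAX mirrors the maximum operation on {1, 2, 3, 4}; the table SUB, which sends (x, y) to ((x − y) mod 4) + 1, is our own example of an operation that fails both tests.

```python
def is_commutative(table):
    n = len(table)
    return all(table[x][y] == table[y][x] for x in range(n) for y in range(n))

def is_associative(table):
    # Entries are elements of {1, ..., n}; subtract 1 to use them as indices.
    n = len(table)
    return all(
        table[x][table[y][z] - 1] == table[table[x][y] - 1][z]
        for x in range(n) for y in range(n) for z in range(n)
    )

# Table of the maximum operation on {1, 2, 3, 4}.
MAX = [[max(i, j) for j in range(1, 5)] for i in range(1, 5)]

# Hypothetical example: subtraction modulo 4, relabelled to land in {1, ..., 4}.
SUB = [[(i - j) % 4 + 1 for j in range(1, 5)] for i in range(1, 5)]

print(is_commutative(MAX), is_associative(MAX))
print(is_commutative(SUB), is_associative(SUB))
```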


Example 8. Let X be a fixed set. Each subset A of X determines a 
function f_A from X to {0, 1}. If x is in X, then xf_A is defined to be 1 if 
x ∈ A and 0 if x ∉ A. The function f_A is called the characteristic function 
of A. If B ⊆ X, then A = B if and only if f_A = f_B. It is in this sense that 
f_A characterizes A. 


Characteristic functions provide another method of representing 
subsets of ⍳N. If A is a subset of ⍳N, then the characteristic function of 
A is the logical vector X such that X[I] is 1 if and only if I is in A. We will 
refer to X as the characteristic vector of A. The basic set theoretic 
operations can easily be performed using characteristic vectors. For 
example, if Y is the characteristic vector for another subset B of ⍳N, then 
X∨Y is the characteristic vector for A ∪ B. The formulation of the 
characteristic vectors for A ∩ B, A − B, and the complement of A in ⍳N is 
left as an exercise. 
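The passage from a subset of {1, ..., N} to its characteristic vector and back, together with the union rule just stated, can be sketched as follows. This is Python rather than APL, and the function names are our own; only the union is shown, so the remaining operations stay as an exercise.

```python
def char_vector(a, n):
    """Characteristic vector of the subset a of {1, ..., n}."""
    return [int(i in a) for i in range(1, n + 1)]

def from_char_vector(x):
    """Recover the subset of {1, ..., n} from its characteristic vector."""
    return {i + 1 for i, bit in enumerate(x) if bit}

def union_vector(x, y):
    """Entrywise 'or', the characteristic vector of the union."""
    return [p | q for p, q in zip(x, y)]

A = {1, 3}
B = {3, 4}
X = char_vector(A, 5)
Y = char_vector(B, 5)
print(X, Y, union_vector(X, Y))
```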

In Section 1 we described how to represent a set of sets of real 
numbers by a matrix whose ith row lists the elements of the ith set. For 
example, the matrix 

      ⎕←A←2 SSUB 4 
1 2 
1 3 
2 3 
1 4 
2 4 
3 4 

represents the six two-element subsets of {1, 2, 3, 4}. This method of 
representation is best suited to the case in which all of the sets have the 
same number of elements. Characteristic vectors can be used as an 
alternative for representing sets of sets, particularly subsets of ⍳N, which 
easily handles sets of different cardinalities. 

Suppose A₁, ..., A_K are subsets of ⍳N. We can represent {A₁, ..., A_K} 
by the matrix C whose Ith row is the characteristic vector of A_I. The 
procedure SCHV computes this representation for the family of sets listed in 
the rows of a matrix. 

      ⎕←C←4 SCHV A 
1 1 0 0 
1 0 1 0 
0 1 1 0 
1 0 0 1 
0 1 0 1 
0 0 1 1 

The first argument of SCHV is an integer N such that all the sets are subsets 
of ⍳N. 

      ⎕IO←0 
      ⎕←D←5 SCHV A 
0 1 1 0 0 
0 1 0 1 0 
0 0 1 1 0 
0 1 0 0 1 
0 0 1 0 1 
0 0 0 1 1 
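The idea of one characteristic row per set can be imitated outside APL. The sketch below is Python, and char_matrix is a hypothetical stand-in that makes no claim about how the library procedure SCHV is actually written. It works in origin 1 and, unlike a matrix of element lists, accepts sets of different sizes.

```python
from itertools import combinations

def char_matrix(n, sets):
    """One row per set: the characteristic vector of that subset of {1, ..., n}."""
    return [[int(i in s) for i in range(1, n + 1)] for s in sets]

# The six two-element subsets of {1, 2, 3, 4}, ordered by largest element.
pairs = sorted(combinations(range(1, 5), 2), key=lambda p: (p[1], p[0]))
C = char_matrix(4, pairs)
for row in C:
    print(*row)

# Sets of different cardinalities cause no difficulty.
M = char_matrix(4, [set(), {2}, {1, 3, 4}])
print(M)
```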


Let us now take another look at our definition of the term 
"function". We have said that a function f from X to Y is a particular kind 
of relation from X to Y, that is, a subset of X × Y. Suppose we are given f 
in just this form, as a set of ordered pairs. Is it possible to determine X 
and Y from this information? Our definition says that for every x in X there 
is exactly one element y of Y such that (x, y) is in f. Thus X is the set of 
first components of the elements of f. Therefore the domain X is uniquely 
determined. However, the codomain Y is not unique. The set of second 
components of the elements of f is the image Xf of f, which is a subset 
of Y, but Y may be any set containing Xf. For example, if 

f = {(m, m² + 1) | m ∈ Z}, 

then, according to our definition, f: Z → Z, f: Z → Q, and f: Z → R. 

In some areas of mathematics it is convenient to change the definition 
so that a function F from X to Y determines both X and Y uniquely. This 
is done by defining the function F to be a triple (X, Y, f), where X and Y 
are sets and f is a function in our sense from X to Y. In this context f is 
called the graph of F. We will not adopt this somewhat more cumbersome, 
although technically superior, definition of a function. The reader is warned, 
however, that a few of the definitions we will give concerning functions 
assume that the codomains as well as the domains are specified. 

If f: X → Y and A ⊆ X, we can construct a function h from A to Y 
by restricting the domain in the following sense. We set h = f ∩ (A × Y) and 
refer to h as the restriction of f to A, writing h = f|_A. The elements of h 
are the ordered pairs in f whose first components lie in A. 


THEOREM 1. Let f: X → Y and g: Y → Z and suppose A ⊆ X. Then 
(a) f∘g is a function from X to Z. 
(b) f|_A is a function from A to Y. 


Proof. (a) The composition f∘g is certainly a relation from X to 
Z. We must show f∘g is, in fact, a function. Suppose (x, z₁) and (x, z₂) 
are both in f∘g. Then there exist y₁ and y₂ in Y such that (x, y_i) ∈ f 
and (y_i, z_i) ∈ g for i = 1, 2. However, since f and g are functions, 
y₁ = xf = y₂ and, therefore, z₁ = y₁g = y₂g = z₂. Thus x is related to at 
most one element of Z by f∘g. For any x in X we have (x, xf) in f and 
(xf, (xf)g) in g, and so (x, (xf)g) is in f∘g. Hence x is related to exactly 
one element of Z and f∘g is a function from X to Z. 


The proof of part (b) is left as an exercise. □ 


A function from ⍳L to ⍳M is simply a vector F of length L whose 
components lie in ⍳M. If G is a vector that is a function from ⍳M to ⍳N, 
then the composition of F and G is the vector H←G[F]. 

      ⎕IO←1 
      F←1 3 5 
      G←2 4 6 8 10 
      G[F] 
2 6 10 
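Indexing one vector by another has a direct counterpart in most languages. In the Python sketch below, which is not APL, the hypothetical function compose_vectors applies F first and then G, and it subtracts 1 because Python lists are indexed from 0 while the vectors here are regarded as functions on {1, ..., L}.

```python
def compose_vectors(f, g):
    """Vector of the composition: position i holds the image under g of f[i]."""
    return [g[y - 1] for y in f]

F = [1, 3, 5]           # a function from {1, 2, 3} to {1, ..., 5}
G = [2, 4, 6, 8, 10]    # a function from {1, ..., 5} to {1, ..., 10}
H = compose_vectors(F, G)
print(H)
```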


If f: X → Y and A ⊆ X, then the set Af defined in Section 2 can be 
written {af | a ∈ A}. We say that f is surjective or that f maps X onto Y if 
Xf = Y. This means that for each y in Y there is at least one x in X such 
that y = xf. Note that the definition of the term "surjective" requires that 
the codomain Y be specified. We say that f is injective or one to one 
(abbreviated 1-1) if each element of Y is the image of at most one element of 
X. In general, the inverse f⁻¹ of f is a relation from Y to X, but not a 
function. If B is a subset of Y, then Bf⁻¹ is called the inverse image of B 
under f. It is clear that f is injective if and only if |{y}f⁻¹| ≤ 1 for all 
y in Y. If f is both surjective and injective, then f is said to be 
bijective. A surjection, injection, or bijection is a function that is 
surjective, injective, or bijective, respectively. Bijections are also 
called 1-1 correspondences. The inverse of a bijection from X to Y is not 
only a function but a bijection from Y to X. A bijection from X to itself is 
called a permutation of X. The set of all permutations of the set X will be 
denoted Σ(X). In the special case that X is ⍳N we write Σ_N for Σ(X). 
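For a function from {1, ..., L} to {1, ..., M} stored as a vector of images, these properties are easy to test. The following is a Python sketch with hypothetical names, not the book's APL. Note that the surjectivity test needs the codomain size M as an explicit argument, in line with the remark above that the codomain must be specified.

```python
def is_injective(f):
    # No value is the image of two different arguments.
    return len(set(f)) == len(f)

def is_surjective(f, m):
    # Every element of {1, ..., m} occurs as an image.
    return set(f) == set(range(1, m + 1))

def inverse(f):
    """Inverse of a permutation of {1, ..., n} given as a vector of images."""
    inv = [0] * len(f)
    for i, y in enumerate(f):
        inv[y - 1] = i + 1
    return inv

P = [2, 3, 1]        # a permutation of {1, 2, 3}
Q = [1, 1, 2]        # neither injective nor onto {1, 2, 3}
print(is_injective(P), is_surjective(P, 3), inverse(P))
print(is_injective(Q), is_surjective(Q, 3), is_surjective(Q, 2))
```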

THEOREM 2. Suppose f: X → Y and g: Y → Z. Then 

(a) If f and g are surjective, then so is f∘g. 

(b) If f and g are injective, then so is f∘g. 

(c) If f∘g is surjective, then g is surjective. 

(d) If f∘g is injective, then f is injective. 

Proof. We will prove only part (a), leaving the rest to the reader. Let 
z ∈ Z. Then, since g maps Y onto Z, there is an element y in Y with yg = z. 
Because f is surjective, there is an element x in X with xf = y. Then 
x(f∘g) = z, so f∘g is surjective. □ 

COROLLARY 3. If f and g are permutations of the set X, then so are 
f∘g and f⁻¹. □ 

COROLLARY 4. Suppose f: X → Y and g: Y → X. Assume that f∘g is 
the identity function on X and g∘f is the identity function on Y. Then f and 
g are bijections and g = f⁻¹. 

Proof. Since the identity function on a set is a bijection, we have in 
particular that f∘g is injective and g∘f is surjective. By parts (c) and (d) 
of Theorem 2 we see that f is both injective and surjective and hence is a 
bijection. If xf = y, then yg = (xf)g = x(f∘g) = x = yf⁻¹. Thus g = f⁻¹. □ 

As an application of Corollary 4, we complete our discussion of the 
connection between equivalence relations and partitions begun in Section 
2. For any set X, let Eq(X) denote the set of equivalence relations on X 
and let Part(X) be the set of partitions of X. 


THEOREM 5. For any set X there exists a bijection from Eq(X) to 
Part(X). 

Proof. Define f: Eq(X) → Part(X) and g: Part(X) → Eq(X) by f: E ↦ X/E 
and g: Π ↦ E_Π. By Theorem 2.3, f∘g is the identity on Eq(X) and g∘f 
is the identity on Part(X). By Corollary 4, f is a bijection. □ 


Let X, Y, and Z be sets. An element of (X × Y) × Z has the form 
((x, y), z), and an element of X × (Y × Z) has the form (x, (y, z)). The 
function that maps ((x, y), z) to (x, (y, z)) is clearly a bijection from 
(X × Y) × Z to X × (Y × Z). When such an obvious bijection exists from one 
set to another, we will often identify the two sets, considering them to be 
the same even though, strictly speaking, they are different. Thus we will 
normally write X × Y × Z without parentheses and denote a typical element 
by (x, y, z). 

We have been working informally with the notion of a finite set. It 
is now time to make this concept precise. A set X is finite if, whenever 
f: X → X is injective, then f is surjective. The map x ↦ x + 1 of N into 
itself is injective but not surjective. Thus, by our definition, N is not 
finite, that is, it is infinite. The connection between this definition of 
finiteness and the idea of a set with a finite number of elements is given 
by the following theorem. 


THEOREM 6. A set X is finite if and only if there exists a nonnegative 
integer n and a bijection from {1, 2, ..., n} to X. The integer n is unique. 


Although this theorem is intuitively very natural and perhaps even 
"obvious", its formal proof requires a fairly large number of steps that 
do not significantly increase our basic understanding of the underlying 
concepts. For this reason we omit the proof. □ 

THEOREM 7. Let X be a finite set and suppose f: X → X is surjective. 
Then f is injective. 

Proof. For each x in X choose one element y of X such that x = yf 
and define g: X → X by g: x ↦ y. If x₁g = x₂g, then x₁ = (x₁g)f = (x₂g)f = 
x₂ and so g is injective. Since X is finite, g must also be surjective. Now 
suppose f is not injective. Then, for some x₁ ≠ x₂ in X, we have x₁f = x₂f. 
Since g is surjective, there exist u₁ and u₂ in X such that u₁g = x₁ and 
u₂g = x₂. Thus 

u₁ = (u₁g)f = x₁f = x₂f = (u₂g)f = u₂. 

Hence u₁ = u₂, which is impossible, since u₁g ≠ u₂g. □ 

Theorem 7 shows that the finite sets are precisely those sets X for 
which the notions of injectivity and surjectivity coincide for maps from 
X to X. This is the most important property of finite sets. 

Suppose for each element i of a nonempty set I we have a set A_i. We 
can think of A as a function mapping the element i in I to the set A_i. Such 
a function is often referred to as a family of sets indexed by I. Given such 
a family, we define 

⋃_{i ∈ I} A_i 

to be the set of elements contained in at least one A_i and 

⋂_{i ∈ I} A_i 

to be the set of elements contained in every one of the A_i. We say the A_i 
are pairwise disjoint if A_i ∩ A_j = ∅ whenever i ≠ j. If, for each real 
number x, we let 

A_x = {y ∈ R | x < y}, 

then 

⋃_{x ∈ R} A_x = R,     ⋂_{x ∈ R} A_x = ∅. 


EXERCISES 


1 According to our definition, a function f from X to Y can have
only one argument: an element of its domain. What do we mean
when we speak of a function with two or more arguments?

2 For each of the following subsets of $\mathbf{Z} \times \mathbf{Z}$ tell whether or not the
set is a function from Z to Z.
(a) $\{(m, m^2) \mid m \in \mathbf{Z}\}$.
(b) $\{(2m, m) \mid m \in \mathbf{Z}\}$.
(c) $\{(m, n) \mid m, n \in \mathbf{Z}, |m| = |n|\}$.
(d) $\{(m, 2m + 1) \mid m \in \mathbf{Z}\} \cup \{(m - 1, 2m - 1) \mid m \in \mathbf{Z}\}$.

3 Show that if X and Y are finite sets, then $|Y^X| = |Y|^{|X|}$. (What
convention must be made for $0^0$?)

4 Prove part (b) of Theorem 1.

5 Complete the proof of Theorem 2.

6 Give examples of pairs of functions f and g with $f \circ g$ defined such
that
(a) $f \circ g$ is surjective but f is not surjective.
(b) $f \circ g$ is injective but g is not injective.

7 How many permutations of the empty set are there?

8 Show that the empty set is finite by our definition.

9 Prove that if X is a finite set, then the number of permutations of
X is |X|!.

10 Using our definition of finiteness, prove
(a) If X is finite and $f: X \to Y$ is a bijection, then Y is finite.
(b) Any subset of a finite set is finite.

11 Let X be a finite set and let f, g be maps of X into itself. Suppose
$f \circ g$ is the identity permutation of X. Show that f and g are
permutations of X and $g = f^{-1}$. Give an example that shows that the
finiteness of X is necessary.

12 Let $f: X \to Y$ and suppose that A and B are subsets of Y. Show that

    $(A \cap B)f^{-1} = (Af^{-1}) \cap (Bf^{-1})$.

(Compare Exercise 2.5c.)

13 Let X and Y be sets, let $f: X \to Y$, and let E be the set of elements
$(x_1, x_2)$ in $X \times X$ such that $x_1 f = x_2 f$. Show that E is an
equivalence relation on X.

14 What would be a reasonable definition of a ternary operation on
a set X?

15 Show that the intersection of any nonempty family of equivalence
relations on a set X is again an equivalence relation on X.

16 Let E be a family of equivalence relations on a set X indexed by
N. Suppose $E_n \subseteq E_{n+1}$ for all n in N. Show that

    $\bigcup_{n \in \mathbf{N}} E_n$

is an equivalence relation on X.

17 Prove De Morgan's Laws. That is, show that if A is a family of sets
indexed by I and B is a set, then
(a) $B \cap \bigcup_{i \in I} A_i = \bigcup_{i \in I} (B \cap A_i)$.
(b) $B \cup \bigcap_{i \in I} A_i = \bigcap_{i \in I} (B \cup A_i)$.
(c) $B - \bigcup_{i \in I} A_i = \bigcap_{i \in I} (B - A_i)$.
(d) $B - \bigcap_{i \in I} A_i = \bigcup_{i \in I} (B - A_i)$.

18 Sketch a proof of Theorem 6, making clear what facts about the
integers are used.

19 Let X and Y be the characteristic vectors for two subsets A and
B of ⍳N. What are the characteristic vectors for $A \cap B$, $A - B$,
and the complement of A in ⍳N?

20 Let A be a vector listing the elements of a subset A of ⍳N. Write
an APL expression for the characteristic vector X of A. Suppose
X is given. How can we obtain a list of the elements of A?

21 Show that the characteristic matrix R of a relation R from ⍳M
to ⍳N is the characteristic function of R considered as a subset
of the Cartesian product of ⍳M and ⍳N.

22 Let R be the characteristic matrix of a relation R from ⍳M to ⍳N.
Write an APL proposition equivalent to the assertion that R is a
function from ⍳M to ⍳N.


23 We have two methods available for representing a function f from
⍳M to ⍳N by an APL array. We can use either the vector F such
that F[I] is the image of I under f or the characteristic matrix
R of f. Write APL expressions that construct R from F and F from R.

24 Let F be a vector describing a function f from ⍳M to ⍳N. Write
APL propositions corresponding to the assertions that f is 1-1
and that f is onto. Let A be a vector listing the elements of a subset
A of ⍳N. What is the characteristic vector of $Af^{-1}$?

25 Let F be a vector. Write an APL proposition equivalent to the
assertion that F is a permutation of ⍳⍴F.

26 Assume that F is a permutation of ⍳⍴F. Write an APL expression
for the inverse of F.

27 Let B be an N-by-N matrix. Write an APL proposition corresponding
to the statement that B is a binary operation on ⍳N.

For Exercises 28 and 29, assume that B is a binary operation on ⍳N.

28 Write APL propositions for the following assertions.
(a) B is commutative.
(b) B is associative.

29 A left identity element for a binary operation • on a set X is an
element e of X such that e • x = x for all x in X. Write an APL
proposition asserting that I is a left identity element for B. Write
an APL expression for the characteristic vector of the set of left
identity elements for B.

30 Let $A_1, \ldots, A_m$ be subsets of ⍳N and let A be the matrix such
that A[I;] is the characteristic vector for $A_I$. What are the
characteristic vectors of $A_1 \cup \cdots \cup A_m$ and $A_1 \cap \cdots \cap A_m$?

31 Let R be the characteristic matrix for a relation R from ⍳M to
⍳N. Let A be a vector listing the elements of a subset A of ⍳M.
Write an APL expression for the characteristic vector of the set
AR.

32 Let $A_1, \ldots, A_r$ and $B_1, \ldots, B_s$ be two sequences of subsets of
⍳N and let X and Y be the matrices of characteristic vectors for
these two sequences. Write APL expressions for the matrices
whose i,jth entries are
(a) $|A_i \cap B_j|$.
(b) $|A_i \cup B_j|$.
(c) 1 if $A_i \subseteq B_j$ and 0 otherwise.
(d) 1 if $A_i = B_j$ and 0 otherwise.
(e) 1 if $A_i \cap B_j \neq \emptyset$ and 0 otherwise.


33 Let X be the characteristic vector for the subset of ⍳N listed in
the vector A. Suppose P is a permutation of ⍳N. Write an APL
expression involving only X and P for the characteristic vector
of the set listed in P[A].

34 Write an APL expression for a (2*N)-by-N matrix whose rows
list the characteristic vectors of all subsets of ⍳N.

35 Let • be an associative binary operation on a set X. If $x \in X$ and
n is a positive integer, then $x^n$ is defined to be the product
x • x • ... • x with n factors. In order to compute $x^n$, it is not always
necessary to calculate n - 1 products. For example, $x^8$ can be
computed by forming $x^2 = x • x$, $x^4 = x^2 • x^2$, and $x^8 = x^4 • x^4$,
which requires only three products. Suppose that, in order to
evaluate a product x • y, a computer program must be run at a
cost of $.50. What is the minimum expenditure required to compute
$x^{127}$ for a given element x of X?


*4. SETS OF SETS USING APL 


In the previous sections of this chapter we have discussed ways of
representing subsets of ⍳N by APL arrays and techniques for performing
set-theoretic operations with these representations. Working with a small number
of sets presents few challenges, either theoretical or computational. However,
the study of certain types of large families of sets suggests many
interesting problems, some of which can lead to extensive machine
computation. Although more properly a part of the branch of mathematics
known as combinatorial theory, several problems of this type will be
described briefly in the next section. In this section we will investigate some
methods for answering various types of questions about moderately large
families of subsets of ⍳N using APL. These techniques will be useful not
only in Section 5 but also in Section 3.9. The reader is assumed to be
familiar with inner products of vectors and matrices as described in Section
A1.7.

As noted in Section 3, characteristic vectors are usually more
convenient than lists of elements for set-theoretic calculations with subsets
of ⍳N when N is not too large. The characteristic vector X for the set listed
in the vector A is (⍳N)∊A.

      A←3 8 2 6 8 3
      ⎕←X←(⍳9)∊A
0 1 1 0 0 1 0 1 0

Given X, we can easily reconstruct a list of the elements.

      ⎕←B←X/⍳9
2 3 6 8

Note that A and B are not the same vector. We can also use the procedure
SCHV to construct characteristic vectors.

      9 SCHV A
0 1 1 0 0 1 0 1 0
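
For readers without an APL system, the two conversions above can be mimicked in Python. This is only a sketch: `char_vector` and `elements_of` are hypothetical helper names standing in for the APL expressions (⍳N)∊A and X/⍳N, with origin 1 as in the text.

```python
def char_vector(n, elements):
    # analogue of (iota n) membership A: entry i-1 is 1 when i is listed
    members = set(elements)
    return [1 if i in members else 0 for i in range(1, n + 1)]

def elements_of(x):
    # analogue of X / iota n: the positions (origin 1) holding a 1
    return [i for i, bit in enumerate(x, start=1) if bit]

A = [3, 8, 2, 6, 8, 3]
X = char_vector(9, A)
B = elements_of(X)
print(X)  # [0, 1, 1, 0, 0, 1, 0, 1, 0]
print(B)  # [2, 3, 6, 8]
```

As in the APL session, the list B is sorted and free of repetitions, so it differs from A even though both describe the same set.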


Let us now consider some simple questions about two families
$\{A_1, \ldots, A_5\}$ and $\{B_1, \ldots, B_4\}$ of subsets of ⍳9, where $A_i$ is the set
listed in the ith row of

      ⎕←A←5 3⍴7 5 2 4 3 5 6 9 6 1 6 8 8 2 8
7 5 2
4 3 5
6 9 6
1 6 8
8 2 8

and $B_j$ is the set listed in the jth row of

      ⎕←B←4 4⍴7 2 8 6 9 1 8 9 6 3 9 3 7 3 4 2
7 2 8 6
9 1 8 9
6 3 9 3
7 3 4 2

How can we construct APL expressions that answer the following?
1. Is there an $A_i$ that is a subset of some $B_j$?
2. Are the sets $A_1$, $A_2$, $A_3$, $A_4$, and $A_5$ distinct?
3. What are the numbers $|A_i \cap B_j|$?

To get an idea about how to form the right expressions, let us look
at the case of just two sets. Suppose P and Q are the characteristic vectors
for two subsets P and Q of ⍳N. Then $P \subseteq Q$ if and only if the APL
proposition ∧/P≤Q is true. This proposition can be rewritten as P∧.≤Q. The
example


verifies that {1, 4, 6} is not a subset of {1, 2, 5, 6}. The sets P and Q are
equal if and only if ∧/P=Q or, equivalently, if and only if P∧.=Q is true.

      P∧.=Q
0

Finally, the characteristic vector for $P \cap Q$ is P∧Q, and so $|P \cap Q|$ is +/P∧Q
or P+.∧Q.

      P+.∧Q
2
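
The three tests for a pair of characteristic vectors translate directly into Python. In this sketch the function names are illustrative, and the length N = 6 is an assumption, since the text does not show how P and Q were entered.

```python
def is_subset(p, q):
    # analogue of and-reduce of P <= Q
    return all(a <= b for a, b in zip(p, q))

def is_equal(p, q):
    # analogue of and-reduce of P = Q
    return all(a == b for a, b in zip(p, q))

def intersection_size(p, q):
    # analogue of plus-reduce of P and Q
    return sum(a & b for a, b in zip(p, q))

# Characteristic vectors of {1, 4, 6} and {1, 2, 5, 6}, assuming N = 6
P = [1, 0, 0, 1, 0, 1]
Q = [1, 1, 0, 0, 1, 1]
print(is_subset(P, Q), is_equal(P, Q), intersection_size(P, Q))
```

The subset test fails because 4 belongs to the first set only, and the two sets share exactly the elements 1 and 6.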
Now let

      ⎕←X←9 SCHV A
0 1 0 0 1 0 1 0 0
0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 0 0 1
1 0 0 0 0 1 0 1 0
0 1 0 0 0 0 0 1 0
      ⎕←Y←9 SCHV B
0 1 0 0 0 1 1 1 0
1 0 0 0 0 0 0 1 1
0 0 1 0 0 1 0 0 1
0 1 1 1 0 0 1 0 0

To answer question 1 we form the matrix U such that U[I;J] is
X[I;]∧.≤Y[J;], which is 1 if and only if $A_I \subseteq B_J$. By the definition
of the inner product, U is X∧.≤⍉Y.

      U←X∧.≤⍉Y

Our answer to question 1 will be yes provided some entry of U is 1. From

      +/,U
2

we see that there are two pairs (i, j) such that $A_i \subseteq B_j$.
To answer question 2, we want the matrix V such that V[I;J] is
X[I;]∧.=X[J;]. This is the matrix X∧.=⍉X.

      V←X∧.=⍉X

Since $A_i = A_i$ for each i, the five diagonal entries of V are all 1. The sets
$A_i$ are distinct provided V has no other entries equal to 1.

      +/,V
5

Since V has just 5 nonzero entries, the $A_i$ are distinct.
Finally, to answer question 3, we must determine the matrix W such
that W[I;J] is X[I;]+.∧Y[J;]. This means that W is X+.∧⍉Y.

      ⎕←W←X+.∧⍉Y
2 0 0 2
0 0 1 2
1 1 2 0
2 2 1 0
2 1 0 1

Here we see, for example, that $|A_4 \cap B_3|$ = W[4;3] = 1.
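
The generalized inner products used for U, V, and W can be imitated in Python by passing the two operations as functions. The sketch below is an illustration under assumptions: `inner_product` is a hypothetical helper, it takes the rows of the second family directly (so no transpose is needed), and the rows of A and B are those of the example as transcribed here.

```python
def char_vector(n, elements):
    members = set(elements)
    return [1 if i in members else 0 for i in range(1, n + 1)]

def inner_product(reduce_fn, combine_fn, rows_x, rows_y):
    # entry [i][j] applies combine_fn entrywise to row i of rows_x and
    # row j of rows_y, then reduces the results with reduce_fn
    return [[int(reduce_fn(combine_fn(a, b) for a, b in zip(rx, ry)))
             for ry in rows_y]
            for rx in rows_x]

A = [[7, 5, 2], [4, 3, 5], [6, 9, 6], [1, 6, 8], [8, 2, 8]]
B = [[7, 2, 8, 6], [9, 1, 8, 9], [6, 3, 9, 3], [7, 3, 4, 2]]
X = [char_vector(9, row) for row in A]
Y = [char_vector(9, row) for row in B]

U = inner_product(all, lambda a, b: a <= b, X, Y)   # subset test
V = inner_product(all, lambda a, b: a == b, X, X)   # equality test
W = inner_product(sum, lambda a, b: a & b, X, Y)    # intersection sizes

print(sum(map(sum, U)))  # 2
print(sum(map(sum, V)))  # 5
print(W)
```

Passing `all` or `sum` as the reducing function plays the role of the left operand of the APL inner product, and the lambda plays the role of the right operand.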

Sometimes we want to work with the set of all subsets of ⍳N. The
characteristic vector of a subset of ⍳N is a logical vector of length N that
we can consider as the vector of digits of an integer in binary or base 2
notation. (See Section A1.8.) The characteristic vectors of all $2^N$ subsets
of ⍳N are simply the N-digit binary representations of the numbers

    $0, 1, \ldots, 2^N - 1$.

The matrix Z←⍉(N⍴2)⊤⍳2*N lists these vectors. The order in which the
vectors are listed depends on the origin. The order is more natural in origin 0.

      ⎕IO←0
      N←3
      ⎕←Z←⍉(N⍴2)⊤⍳2*N
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1

Given a logical vector R of length N, we can find the integer I such that R
is Z[I;] by I←2⊥R.

      R←1 0 1
      ⎕←I←2⊥R
5
      Z[I;]
1 0 1
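
The correspondence between subsets and binary numerals can be sketched in Python as follows. The helper names are assumptions; `all_char_vectors` plays the role of the matrix Z in origin 0 and `index_of` the role of 2⊥R.

```python
def all_char_vectors(n):
    # row i holds the n-digit binary representation of i, leading digit first
    return [[(i >> (n - 1 - k)) & 1 for k in range(n)]
            for i in range(2 ** n)]

def index_of(r):
    # base 2 value of the logical vector r, leading digit first
    value = 0
    for bit in r:
        value = 2 * value + bit
    return value

Z = all_char_vectors(3)
R = [1, 0, 1]
I = index_of(R)
print(I)     # 5
print(Z[I])  # [1, 0, 1]
```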


It is often useful to be able to select one representative from each set
in a given family of sets. If U is a matrix whose rows are characteristic vectors
for nonempty subsets of ⍳N, then V←SFEL U is the vector whose Ith
component is the first element in U[I;]/⍳N. Using the matrix X previously
defined, we get

      ⎕IO←1
      X
0 1 0 0 1 0 1 0 0
0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 0 0 1
1 0 0 0 0 1 0 1 0
0 1 0 0 0 0 0 1 0
      SFEL X
2 3 6 1 2

It is possible to describe SFEL U by a single APL expression involving an
inner product.


EXERCISES 


1 Let Z be a logical matrix with N columns. Write an APL proposition
corresponding to the assertion that the sets whose characteristic
vectors are the rows of Z form a partition of ⍳N.

2 Let E be the characteristic matrix for an equivalence relation
E on ⍳N. Write an APL expression for a vector that gives one
representative from each equivalence class of E. Procedures in
CLASSLIB may be used in the expression.

3 This problem deals with the sets $\{A_1, \ldots, A_5\}$ and $\{B_1, \ldots, B_4\}$
defined in the text. Write APL expressions that answer the following
questions.
(a) For how many pairs (i, j) is $A_i \cap B_j = \emptyset$?
(b) What is the matrix of integers $|A_i \cup B_j|$?
(c) Which two-element set is contained in the largest number of
$B_j$?

4 Let Z be a matrix all of whose entries are nonnegative integers
less than N and let F←N⊥⍉Z. Show that the matrices Z∧.=⍉Z and
F∘.=F are the same. Which construction takes less central processing
unit (CPU) time?

5 Let Z be a logical matrix such that ∧/∨/Z is 1. Write an expression
for SFEL Z using only primitive operations. (Hint. One way to
solve the problem involves the inner product ⌈.×.)

6 Suppose the logical matrices P1 and P2 list the characteristic
vectors of the blocks of two partitions $\Pi_1$ and $\Pi_2$ of ⍳N. Write
an APL proposition corresponding to the assertion that $\Pi_1$ is a
refinement of $\Pi_2$. (See Exercise 1.7.)




*5. BLOCK DESIGNS AND GRAPHS 


In this section we will apply some of the techniques discussed in Section 4
to the study of two types of families of sets that are of particular interest.
The first type is illustrated by the array DESIGN1 in EXAMPLES, which lists
seven three-element subsets of ⍳7 with origin 1. For simplicity, let us
rename this array D.

      ⎕IO←1
      ⎕←D←DESIGN1
1 2 3
1 4 5
1 6 7
2 4 6
2 5 7
3 4 7
3 5 6


It is not hard to verify that the sets listed in D satisfy the following
properties.

1. Each set has three elements.
2. Each element of ⍳7 is contained in exactly three sets.
3. Each two-element subset of ⍳7 is contained in exactly one set.

We say that the rows of D form a block design with parameters 7, 7, 3,
3, 1.

In general, a block design with parameters $v, b, k, r, \lambda$ is a pair $(X, \mathcal{B})$,
where X is a finite set and $\mathcal{B}$ is a family of subsets of X such that

1. $|X| = v$.
2. $|\mathcal{B}| = b$.
3. If B is in $\mathcal{B}$, then $|B| = k$.
4. If x is in X, then x is contained in exactly r elements of $\mathcal{B}$.
5. Each two-element subset of X is contained in exactly $\lambda$ elements
of $\mathcal{B}$.

The elements of $\mathcal{B}$ are called the blocks of the design.
Let $\mathcal{D}$ be the family of sets listed in D. Using the methods of Section
4, we can easily check that (⍳7, $\mathcal{D}$) is a block design. To make sure that
the elements of $\mathcal{D}$ are subsets of ⍳7, we compute

      ∧/,D∊⍳7
1

Obviously, ⍳7 has seven elements, and so v = 7. The remaining calculations
are done using the matrix of characteristic vectors.


      ⎕←X←7 SCHV D
1 1 1 0 0 0 0
1 0 0 1 1 0 0
1 0 0 0 0 1 1
0 1 0 1 0 1 0
0 1 0 0 1 0 1
0 0 1 1 0 0 1
0 0 1 0 1 1 0

To check that $|\mathcal{D}| = 7$, we need to see that the rows of X are distinct.

      +/,X∧.=⍉X
7

Since X∧.=⍉X has just seven entries equal to 1, we do have b = 7. The Ith
component of +/X is the cardinality of the Ith element of $\mathcal{D}$. From

      ∧/3=+/X
1

we see that all elements of $\mathcal{D}$ have k = 3 elements. The Jth component of
+⌿X is the number of elements of $\mathcal{D}$ containing J. Since

      ∧/3=+⌿X
1

it follows that every element of ⍳7 is in r = 3 elements of $\mathcal{D}$.
The fifth axiom is a little more complicated to verify. We first form

      Y←7 SCHV 2 SSUB 7
      ⍴Y
21 7

The Ith row of Y is the characteristic vector for the Ith two-element subset
of ⍳7. If

      M←Y∧.≤⍉X

then M[I;J] is 1 if the Ith two-element set is contained in the Jth element
of $\mathcal{D}$ and 0 otherwise. Therefore the Ith component of

      L←+/M

gives the number of elements of $\mathcal{D}$ containing the Ith two-element set. From

      ∧/1=L
1

we see that every two-element set is contained in $\lambda = 1$ elements of $\mathcal{D}$. Thus
(⍳7, $\mathcal{D}$) is a block design with parameters $v = 7$, $b = 7$, $k = 3$, $r = 3$, $\lambda = 1$.
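
The five axioms can also be checked directly from the definition. The Python sketch below is a hedged illustration: `design_parameters` is a hypothetical helper, and the block list is the standard seven-point design, assumed here to coincide with the array DESIGN1.

```python
from itertools import combinations

def design_parameters(points, blocks):
    """Return (v, b, k, r, lam) if the blocks form a block design on the
    given points, and None otherwise."""
    points = sorted(points)
    blocks = [frozenset(block) for block in blocks]
    if len(set(blocks)) != len(blocks):
        return None  # blocks must be distinct
    if any(not block <= set(points) for block in blocks):
        return None  # blocks must be subsets of the point set
    sizes = {len(block) for block in blocks}
    reps = {sum(x in block for block in blocks) for x in points}
    lams = {sum(set(pair) <= block for block in blocks)
            for pair in combinations(points, 2)}
    if len(sizes) != 1 or len(reps) != 1 or len(lams) != 1:
        return None
    return (len(points), len(blocks), sizes.pop(), reps.pop(), lams.pop())

# Assumed contents of DESIGN1
D = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
     (2, 5, 7), (3, 4, 7), (3, 5, 6)]
print(design_parameters(range(1, 8), D))  # (7, 7, 3, 3, 1)
```

This version works with sets rather than characteristic matrices, so it trades the compactness of the inner-product formulation for a line-by-line match with the axioms.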

One type of block design that has been studied extensively is the
projective plane. A projective plane is a block design whose parameters
are of the form $v = b = q^2 + q + 1$, $k = r = q + 1$, $\lambda = 1$, where q is an integer
greater than 1. The integer q is called the order of the plane. The design
(⍳7, $\mathcal{D}$) is a projective plane of order 2. The order of every known
projective plane is a power of a prime. No projective plane of order 6 exists,
but it is still an unsolved question whether or not there is a projective plane
of order 10. A great deal of effort has been devoted to this problem,
including a very large amount of machine computation.

The concept of a graph provides another class of examples of
interesting families of sets. An undirected graph (or, simply, a graph) is a pair
(X, E), where X is a set and E is a set of two-element subsets of X. The
elements of X are called the vertices of the graph and the elements of E
are called the edges. One method of describing a graph is to choose one
point in the plane for each vertex and connect two of these points by
a line segment if the corresponding pair of vertices is an edge. For example,
if X is ⍳5 in origin 0 and

    E = {{0, 1}, {0, 2}, {1, 2}, {1, 3}},

then one possible diagram for the graph G = (X, E) would be

[Figure: diagram of G, a triangle on the vertices 0, 1, 2 with vertex 3 joined to vertex 1 and vertex 4 isolated]

If {x, y} is an edge of G, we say x is connected to y by an edge. Note in our
example that 4 is not connected to any other vertex by an edge. One way
to describe G by an APL array is to list the elements of E in a matrix such as

      ⎕←E←4 2⍴0 1 0 2 1 2 1 3
0 1
0 2
1 2
1 3

Of course, we must also remember that X is ⍳5 in order to determine G
completely.


The number of graphs with the vertex set ⍳N is 2*2!N, since there are
2!N two-element subsets of ⍳N and the number of subsets of a set with M
elements is 2*M. However, many of these graphs are so similar that they
are not usually considered to be really distinct. For example, the graph
H given by the diagram

[Figure: diagram of H, a triangle on the vertices 2, 3, 4 with vertex 0 joined to vertex 2 and vertex 1 isolated]

has the edge set {{0, 2}, {2, 3}, {2, 4}, {3, 4}} and thus is not the same
graph as our example G. However, the diagrams for both consist of "a
triangle with a tail and one extra point", and the second can be obtained
from the first by renumbering the points in the diagram. We say that these
two graphs are isomorphic.

More formally, we say that a graph $(X_1, E_1)$ is isomorphic to a graph
$(X_2, E_2)$ if there is a bijection $f: X_1 \to X_2$ such that the edges in $E_2$ are
precisely the sets {xf, yf} = {x, y}f, where {x, y} is an edge in $E_1$ or,
equivalently, such that the map $\{x, y\} \mapsto \{x, y\}f$ is a bijection of $E_1$ onto $E_2$.
The map f is referred to as an isomorphism of $(X_1, E_1)$ onto $(X_2, E_2)$. If
$X_1$ and $X_2$ are both ⍳N, then f is a permutation of ⍳N. If we set

      ⎕IO←0
      F←4 2 3 0 1

then the array

      F[E]
4 2
4 3
2 3
2 0

lists the edges of H. Since E lists the edges of G, we see that F is an
isomorphism of G onto H.
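
The check that F carries G onto H amounts to relabelling each edge and comparing edge sets. A minimal Python sketch of that check follows; the helper names are illustrative and vertices are numbered from 0 as in the text.

```python
def relabel(edges, f):
    # apply the map f (a list: f[x] is the image of x) to every edge
    return {frozenset(f[x] for x in edge) for edge in edges}

def is_isomorphism(f, n, edges_g, edges_h):
    # f must be a permutation of 0..n-1 carrying the edges of G onto those of H
    if sorted(f) != list(range(n)):
        return False
    return relabel(edges_g, f) == {frozenset(e) for e in edges_h}

E_G = [(0, 1), (0, 2), (1, 2), (1, 3)]
E_H = [(0, 2), (2, 3), (2, 4), (3, 4)]
F = [4, 2, 3, 0, 1]
print(is_isomorphism(F, 5, E_G, E_H))  # True
```

Edges are stored as frozensets so that {4, 2} and {2, 4} compare equal, which mirrors the fact that the rows of F[E] list each edge of H in some order.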

Near the end of Chapter 3 we will return to the study of graphs and
determine the number of essentially distinct, that is, nonisomorphic, graphs
with n vertices, where $n \le 5$. As an introduction to the techniques that
will be used, let us consider several ways of representing a graph whose
vertex set is ⍳N. We will use the preceding graph G = (⍳5, E) as an
illustration.

Instead of listing the edges of G in the matrix E we could form the
5-by-5 logical matrix R in which R[I;J] is 1 if and only if I is connected
to J by an edge of G. For our example we have

      ⎕←R←5 5⍴0 1 1 0 0 1 0 1 1 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0
0 1 1 0 0
1 0 1 1 0
1 1 0 0 0
0 1 0 0 0
0 0 0 0 0

We call R the adjacency matrix for G.
Another approach is first to form

      ⎕←P←2 SSUB 5
0 1
0 2
1 2
0 3
1 3
2 3
0 4
1 4
2 4
3 4

and then to specify a graph with vertex set ⍳5 by describing which rows of
P are edges. Thus the edges of G are the rows of P whose indices are listed
in the vector

      V←0 1 2 4

If we form the corresponding characteristic vector,

      ⎕←Z←10 SCHV V
1 1 1 0 1 0 0 0 0 0

then we can list the edges of G in two ways.

      P[V;]
0 1
0 2
1 2
1 3
      Z⌿P
0 1
0 2
1 2
1 3

Finally, we can assign to G the integer

      ⎕←M←2⊥Z
928

which uniquely determines G. In this way we get a numbering of the graphs
with vertex set ⍳5 from 0 to 1023. To see what the 364th graph in this
numbering is, we compute

      ⎕←W←(10⍴2)⊤364
0 1 0 1 1 0 1 1 0 0
      W⌿P
0 2
0 3
1 3
0 4
1 4

Thus the 364th graph is given by the diagram

[Figure: diagram of the 364th graph, with edges {0, 2}, {0, 3}, {1, 3}, {0, 4}, {1, 4}]

It should be emphasized that this numbering is relative to a fixed list of the
two-element subsets of ⍳5, in this case given by P. A similar numbering of
the graphs with vertex set ⍳N for any N can be constructed using the matrix
2 SSUB N.
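
The numbering of graphs can be reproduced in Python once a fixed list of two-element subsets is chosen. This sketch assumes the ordering of the matrix P above (pairs sorted by larger element, then by smaller element); whether 2 SSUB N uses that ordering for every N is an assumption, and the function names are illustrative.

```python
from itertools import combinations

def pairs_in_order(n):
    # two-element subsets of 0..n-1, ordered by larger element first,
    # then by smaller element, matching the matrix P for n = 5
    return sorted(combinations(range(n), 2), key=lambda p: (p[1], p[0]))

def graph_number(edges, n):
    # read the characteristic vector of the edge set as a binary numeral
    edge_set = {frozenset(e) for e in edges}
    number = 0
    for pair in pairs_in_order(n):
        number = 2 * number + (1 if frozenset(pair) in edge_set else 0)
    return number

def graph_edges(number, n):
    # inverse of graph_number: pick the pairs whose binary digit is 1
    pairs = pairs_in_order(n)
    m = len(pairs)
    return [p for k, p in enumerate(pairs) if (number >> (m - 1 - k)) & 1]

print(graph_number([(0, 1), (0, 2), (1, 2), (1, 3)], 5))  # 928
print(graph_edges(364, 5))
```

Changing the order of the pair list changes every graph's number, which is the point the text makes about the numbering being relative to P.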


EXERCISES 


1 The workspace EXAMPLES contains arrays DESIGN2, DESIGN3,
DESIGN4, and DESIGN5. Show that each of these arrays defines a
block design and determine the parameters in each case.

2 Show that the parameters of a block design $(X, \mathcal{B})$ satisfy the
equations $rv = kb$ and $\lambda(v - 1) = r(k - 1)$. [Hint. For the first
equation, count the number of pairs (x, B), where x is in X and B
is a block in $\mathcal{B}$ containing x.]

3 Sketch a diagram for each of the following graphs (X, E).
(a) X = {0, 1, 2, 3}, E = {{0, 2}, {1, 3}, {2, 3}}.
(b) X = {0, 1, 2, 3, 4}, E = {{0, 2}, {0, 3}, {1, 3}, {1, 4}, {2, 4}}.

4 Let G, H, and K be graphs. Show that
(a) G is isomorphic to G.
(b) If G is isomorphic to H, then H is isomorphic to G.
(c) If G is isomorphic to H and H is isomorphic to K, then G is
isomorphic to K.

5 Suppose the edges of a graph G with vertex set ⍳N are listed in
the rows of a matrix E. Show how to construct the adjacency
matrix R of G from N and E using one or more APL statements.
(Hint. First form C←N SCHV E.)

6 Construct the adjacency matrices for the following graphs.
(a) [Figure: diagram of graph (a)]
(b) [Figure: diagram of graph (b)]

7 Let K be the graph in Exercise 6a. How many isomorphisms of K
onto itself are there?

8 What is the number of the graph in Exercise 6a in the numbering
of the graphs with vertex set ⍳4 based on the matrix 2 SSUB 4?
What is the 18th graph in this numbering?

9 Let (X, E) be a graph and let $f: X \to Y$ be a bijection. Define
$F = \{\{xf, yf\} \mid \{x, y\} \in E\}$. Show that (Y, F) is a graph isomorphic
to (X, E).

10 Let X be a set and let Z be the set of two-element subsets of X.
Suppose E and E' are subsets of Z. Prove that the graphs (X, E)
and (X, E') are isomorphic if and only if (X, Z - E) and (X, Z - E')
are isomorphic.

11 Give an appropriate definition for one block design $(X_1, \mathcal{B}_1)$ to be
isomorphic to another block design $(X_2, \mathcal{B}_2)$. Prove that isomorphic
designs have the same parameters.

12 Show that any projective plane of order 2 is isomorphic to the
design described by the matrix DESIGN1.

*13 Show that the arrays DESIGN4 and DESIGN5 describe block
designs that are not isomorphic.


THE INTEGERS 


In this chapter we will develop some important properties of the integers. 
Integers are encountered frequently in many branches of mathematics, but 
this is not the only reason they are of interest to us. As we pursue the study 
of algebra, we will repeatedly come across concepts whose significance was 
first noted in the context of the integers. Later these concepts were seen to 
be useful in much more general situations. Since readers are expected to have 
a basic familiarity with integers, we will proceed in an informal manner, 
much as we did in our review of sets in Chapter 1. Throughout this chapter 
the APL index origin will be assumed to be 1 unless there is an explicit 
statement to the contrary. 


1. DIVISIBILITY 


Space does not permit listing all the facts about the integers that we will 
assume without proof. However, there are two theorems that are so im- 
portant that they need to be mentioned. 


THEOREM 1. Every nonempty set of positive integers has a smallest 
element. 


THEOREM 2. Let m and n be integers with $n \neq 0$. There exist unique
integers q and r such that $m = qn + r$ and $0 \le r < |n|$. □


Let X be a nonempty set of positive integers. According to Theorem
1, there is an element x in X such that $x \le y$ for all y in X. It is this property
that makes it possible to prove theorems by mathematical induction.

In Theorem 2 the integers q and r are called the integral quotient and the
remainder, respectively, when m is divided by n. Integral quotients and
remainders are easily computed in APL. The remainder R when M is divided
by N is (|N)|M. (Why not N|M?) The integral quotient Q is (M-R)÷N. For
convenience, CLASSLIB contains procedures ZQUOT and ZREM, which can
be used to calculate integral quotients and remainders, respectively.


      M←12
      N←5
      (|N)|M
2
      (M-2)÷N
2
      N ZREM M
2
      M ZQUOT N
2

Note that the order of the arguments for ZQUOT and ZREM is analogous to
that for ÷ and |, respectively. The procedures ZQUOT and ZREM are
extended to nonscalar arguments in the same entry-by-entry manner as are
the primitive dyadic scalar operations.

      25 56 ZQUOT 7 11
3 5
      (⍳6) ZREM 31
0 1 1 3 1 1
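
The quotient and remainder of Theorem 2 can be sketched in Python as follows. The functions `zrem` and `zquot` are hypothetical stand-ins that only borrow the names and argument order of the CLASSLIB procedures; they are not a transcription of them.

```python
def zrem(n, m):
    # remainder r of m divided by n, with 0 <= r < |n| as in Theorem 2
    return m % abs(n)

def zquot(m, n):
    # integral quotient q, chosen so that m == q * n + zrem(n, m)
    return (m - zrem(n, m)) // n

for m, n in [(12, 5), (-12, 5), (12, -5), (-12, -5)]:
    q, r = zquot(m, n), zrem(n, m)
    print(m, n, q, r, m == q * n + r)
```

Taking the absolute value of n before reducing is what keeps the remainder nonnegative when n is negative, which is the point of the question "Why not N|M?" in the text.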


An integer a is said to divide or be a divisor of another integer c if 
there is a third integer b such that c = ab. In this case we also say that 
c is a multiple of a. Every integer divides 0, but the only integer divisible 
by 0 is 0 itself. The APL proposition corresponding to the assertion that
A divides C is 0=A|C. In traditional notation the statements "a divides
c" and "a does not divide c" are abbreviated $a \mid c$ and $a \nmid c$, respectively.
Because of the possible confusion with the APL remainder operation, we will
not use the symbol | to denote "divides" in this book.

The following theorem summarizes some important facts about divisors. 

THEOREM 3. The relation "is a divisor of" on Z is reflexive and
transitive. If x and y are in Z, then

(a) x divides y and y divides x if and only if $|x| = |y|$.

(b) If x divides y and $y \neq 0$, then $|x| \le |y|$.

Proof. The reader should already be familiar with these facts. Their 
proofs are left as exercises. 


EXERCISES 


1 Let N be a positive integer. Write APL expressions for
(a) The characteristic vector for the set of positive divisors of
N, considered as a subset of ⍳N.
(b) The list of positive divisors of N.
(c) The number of positive divisors of N.

2 Write an APL expression for a vector D of length N such that D[I]
is the number of positive divisors of I.

3 Prove Theorem 3.

4 Does every nonempty set of positive rational numbers have a
smallest element?

5 Whenever a division is performed on an APL terminal system and
the result is known to be an integer, it is a good practice to apply
the floor operation ⌊ to the quotient. Explain why.


2. GREATEST COMMON DIVISORS 


Let a and b be integers. A common divisor of a and b is an integer c such 
that c divides both a and b. For example, 4 is a common divisor of 12 and 
—20 and 1 is a common divisor of every pair of integers. A greatest common 
divisor of a and b is an integer d such that 


1 $d \ge 0$.
2 d is a common divisor of a and b.
3 d is divisible by every common divisor of a and b.


The set of common divisors of 10 and 12 is D = {-2, -1, 1, 2}, and 2 is a
nonnegative element of D that is divisible by every element in D. Thus 2 
is a greatest common divisor of 10 and 12. The goal of this section is to 
prove that greatest common divisors exist and are unique and to present an 
efficient method for calculating greatest common divisors. 


THEOREM 1. Any two integers a and b have at most one greatest 
common divisor. 


Proof. Suppose d and e are both greatest common divisors of a and 
b. Then d is divisible by every common divisor of a and b and, in particu- 
lar, d is divisible by e. By exactly the same argument, e is divisible by d. 
Since d and e are each nonnegative, we must have d = e. □

In order to show that a greatest common divisor of a and b always 
exists, we consider the set S(a, b) of all integers of the form ra + sb, with 


r,s in Z. As the next theorem shows, S(a, b) cannot be an arbitrary subset 
of Z. 


THEOREM 2. If x and y are in S(a, b), then x + y and x — y are also 
in S(a, b). 


Proof. This follows immediately from the observation that

    $(r_1 a + s_1 b) \pm (r_2 a + s_2 b) = (r_1 \pm r_2)a + (s_1 \pm s_2)b$. □


A nonempty subset M of Z that contains the sum and difference of
any two of its elements is called an additive subgroup of Z. (The general
definition of the term "subgroup" will be given in Chapter 3.) Since S(a, b)
is obviously nonempty, Theorem 2 tells us that S(a, b) is an additive
subgroup of Z. If n is any integer, then the set nZ of all multiples of n is S(n, 0)
and so is an additive subgroup of Z. If n is contained in an additive
subgroup M of Z, then M contains n + n = 2n, 2n + n = 3n, ..., and n - n = 0,
0 - n = -n, -n - n = -2n, .... Thus M contains nZ. We will now show
that the sets nZ are the only additive subgroups of Z.


THEOREM 3. Let M be an additive subgroup of Z. Then there exists a
unique integer $n \ge 0$ such that M = nZ.


Proof. Since $M \neq \emptyset$, there is an element x in M and M contains x - x =
0. If M = {0}, then M = 0Z. Thus we may assume $M \neq \{0\}$ and therefore
M contains an element $y \neq 0$. Since both y and -y are in M, M contains
positive elements. By Theorem 1.1, M contains a smallest positive element n.
Let m be any element of M. By Theorem 1.2, we can find integers q and
r such that m = qn + r and $0 \le r < n$. But m and qn are both in M, so r =
m - qn is in M. By the choice of n, this forces r to be 0. Therefore m = qn
is in nZ. This shows that $M \subseteq n\mathbf{Z}$. However, the inclusion $M \supseteq n\mathbf{Z}$ was noted
previously, and so M = nZ. We still have to establish the uniqueness of n.
Suppose mZ = nZ for some nonnegative integers m and n. Then m and n
must each be a divisor of the other, and this implies that m = n. □


The integer n of Theorem 3 is called the nonnegative generator of M. 


THEOREM 4. Let a and b be integers. Then a and b have a greatest 
common divisor d, d is unique, and d can be written in the form ra + sb, 
where r, s are in Z. 


Proof. The uniqueness of d was shown in Theorem 1. By Theorems 2
and 3, there is a nonnegative integer d in S(a, b) such that S(a, b) = dZ.
Since d is in S(a, b), there exist r and s in Z such that d = ra + sb. Therefore
any common divisor of a and b divides d. Now a = 1a + 0b and b = 0a + 1b
are both in S(a, b), and every element of S(a, b) is divisible by d. Thus
d is a common divisor of a and b. Since $d \ge 0$, we see that d is a greatest
common divisor of a and b. □


Having proved Theorem 4, we may speak of the greatest common 
divisor of a and b, which we denote by gcd(a,b). Although we know gcd(a, b) 
exists, we do not yet have an efficient method for actually computing 
gcd(a, b). The following theorem and its corollary provide the basis for one 
such algorithm. 

THEOREM 5. If a, b, and q are integers, then

    $S(a, b) = S(b, a) = S(a, -b) = S(a, b + qa)$.

Also, $S(0, b) = |b|\mathbf{Z}$.

Proof. We will prove only that $S(a, b) = S(a, b + qa)$, leaving the rest
as an exercise. For any r, s in Z, we have

    $ra + s(b + qa) = (r + sq)a + sb$

and so $S(a, b + qa) \subseteq S(a, b)$. If we let c = b + qa, the same argument shows
that $S(a, c - qa) \subseteq S(a, c)$. But c - qa = b and, hence, $S(a, b) \subseteq S(a, b + qa)$.
Therefore $S(a, b) = S(a, b + qa)$. □

COROLLARY 6. If a, b, and q are integers, then

    gcd(a, b) = gcd(b, a) = gcd(a, -b) = gcd(a, b + qa).

Also, gcd(0, b) = |b|.

Proof. Since gcd(a, b) is the nonnegative generator of S(a, b), the
corollary follows at once from Theorem 5. □


Suppose we are given integers a and b and we want to compute
gcd(a, b). By Corollary 6, we may assume $0 \le a \le b$. If a = 0, then gcd(a, b) =
b. If $a \neq 0$, then we may write b = qa + r, where q is the integral quotient
of b by a and r is the remainder. Since r = b - qa, Corollary 6 tells us that
gcd(a, b) = gcd(a, r) = gcd(r, a). If r = 0, then gcd(a, b) = a. Otherwise we
may repeat the process, dividing r into a to get a new remainder. Since
$0 \le r < a$, this procedure cannot continue indefinitely. Eventually, a
remainder of 0 is reached, and gcd(a, b) is the last nonzero remainder. As an
example, let us compute gcd(493, 533).

    533 = 1 × 493 + 40
    493 = 12 × 40 + 13
     40 = 3 × 13 + 1
     13 = 13 × 1 + 0

By Corollary 6,

    gcd(493, 533) = gcd(40, 493) = gcd(13, 40) = gcd(1, 13) = gcd(0, 1) = 1.

The recursive procedure we have described for calculating gcd(a, b) is
called the Euclidean algorithm after the Greek geometer Euclid, who
flourished around 300 B.C. The procedure ZGCD in CLASSLIB is based on this
algorithm.


      493 ZGCD 533
1

Using ZGCD with nonscalar arguments gives the entry-by-entry greatest
common divisor. One-entry arrays are expanded to match the other
argument just as with the primitive scalar dyadic operations.

      12 14 ZGCD 15 27
3 1
      6 ZGCD ⍳6
1 2 3 2 1 6




Being able to compute greatest common divisors allows us to decide 
the existence of integer solutions of a single linear equation with integer 
coefficients. 


THEOREM 7. Let a, b, and c be integers. There exist integers x and 
y satisfying the equation ax + by =c if and only if gcd(a, b) divides c. 

Proof. Suppose x and y are integers such that ax + by =c. Then c is 
in S(a, b) = dZ, where d = gcd(a, b). Therefore d divides c. On the other 
hand, suppose d divides c, so that c = md for some m in Z. We know there 
exist integers r and s such that d = ra + sb. Multiplying by m gives c = md = 
(mr)a + (ms)b. Thus x = rm and y = sm satisfy ax + by = c. □


As an example, let us consider the equation 493x + 533y = 2. We 
know already that gcd(493, 533) = 1, which divides 2, and so integer solu- 
tions to this equation exist by Theorem 7. The proof of Theorem 7 even 
shows us how to find a solution provided we can determine integers r and s 
such that 1 = 493r + 533s. The Euclidean algorithm can be extended to
produce one pair r, s. We first compute the integral quotients corresponding 
to the remainders we found in calculating gcd(493, 533). 


533 ZQUOT 493
1
493 ZQUOT 40
12
40 ZQUOT 13
3

Thus

40 = 533 - 1 × 493,
13 = 493 - 12 × 40,
1 = 40 - 3 × 13.

Working backward, we find

1 = 40 - 3 × 13
= 40 - 3(493 - 12 × 40)
= 37 × 40 - 3 × 493
= 37(533 - 1 × 493) - 3 × 493
= (-40) × 493 + 37 × 533.


Therefore r = -40 and s = 37 gives one solution to 493r + 533s = 1, and
x = 2r = -80 and y = 2s = 74 is a solution to the original equation 493x +
533y = 2.
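
The back-substitution above can also be carried forward alongside the divisions. The following Python sketch keeps, for each remainder, a pair of coefficients expressing it in terms of a and b; it is an illustration under that assumption, not the CLASSLIB code, and the name extended_gcd is hypothetical.

```python
def extended_gcd(a: int, b: int):
    """Return (d, r, s) with d = gcd(a, b) >= 0 and d == r*a + s*b."""
    old_rem, rem = a, b
    old_r, r = 1, 0            # old_rem == old_r*a + old_s*b holds throughout
    old_s, s = 0, 1            # rem     == r*a     + s*b     holds throughout
    while rem != 0:
        q = old_rem // rem
        old_rem, rem = rem, old_rem - q * rem
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_rem < 0:            # normalise so that the gcd is nonnegative
        old_rem, old_r, old_s = -old_rem, -old_r, -old_s
    return old_rem, old_r, old_s


d, r, s = extended_gcd(493, 533)
m = 2 // d                     # Theorem 7: 493x + 533y = 2 is solvable since d divides 2
print(d, r, s, m * r, m * s)
```

Other pairs r, s also satisfy d = ra + sb, so a different implementation may legitimately return different coefficients.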

The procedure ZGCD computes r and s as the global variables R and S.


493 ZGCD 533
1
R
¯40
S
37
(493×R)+533×S
1




Thus, to find a solution to the equation 2497x — 3872y = 33, we may 
proceed as follows. 


2497 ZGCD 3872
11
R
107
S
¯69
33÷11
3
⎕←X←3×R
321
⎕←Y←-3×S
207
(2497×X)-3872×Y
33


Let a and b be integers. We say that a and b are relatively prime if
gcd(a, b) = 1. By Theorem 4, this is equivalent to requiring that there exist 
integers r and s such that ra + sb = 1. 


THEOREM 8. Let a and b be integers and let d = gcd(a, b). Then 
a/d and b/d are relatively prime. (Here we adopt the APL convention that
0÷0 is 1 for the case a = b = 0.)

Proof. There exist integers r and s such that d = ra + sb. If d = 0, then
a = b = 0 and it is true that gcd(0/0, 0/0) = gcd(1, 1) = 1. If d ≠ 0, then


1 = r(a/d) + s(b/d)


and so gcd(a/d, b/d) = 1. □


THEOREM 9. Suppose a and b are relatively prime integers. If c is
an integer and a divides bc, then a divides c.


Proof. Choose integers r and s such that ra + sb = 1. Then c = cra + csb.
Since a divides cra and, because a divides bc, also csb, a divides c. □


COROLLARY 10. Let a, b, and c be integers and let d = gcd(a, b). If 
a divides bc, then a/d divides c. 


Proof. If a divides bc, then a/d divides (b/d)c. By Theorem 8, 
gcd(a/d, b/d) = 1 and so, by Theorem 9, a/d divides c. □

So far we have considered only greatest common divisors of two
integers. Given three integers a, b, and c, we have several choices for
defining gcd(a, b, c). We could define it to be gcd(a, gcd(b, c)) or gcd(gcd(a,
b), c). However, we could also consider the set S(a, b, c) of integers of the
form ra + sb + tc, prove that S(a, b, c) is an additive subgroup of Z, and
define gcd(a, b, c) to be the nonnegative generator of S(a, b, c). In fact,
all of these definitions are equivalent. (See Exercises 10 and 12.)

Related to the concept of the greatest common divisor is the notion
of the least common multiple. A common multiple of two integers a and
b is an integer m that is divisible by both a and b. We say m is a least
common multiple of a and b if m is a nonnegative common multiple that divides
every common multiple.


THEOREM 11. Every pair of integers a and b has a unique least
common multiple m. If d = gcd(a, b), then md = |ab|.


Proof. If a = 0 or b = 0, then 0 is the only common multiple of a and
b and so m = 0 is the unique least common multiple of a and b. Clearly,
md = |ab| in this case. Thus we may suppose neither a nor b is 0. This means
that d = gcd(a, b) is also nonzero. Since


ab/d = a(b/d) = (a/d)b


and both b/d and a/d are integers, we see that m = |ab|/d is a nonnegative
common multiple of a and b. Suppose n is any common multiple of a and
b. Thus n = ra = sb for some integers r and s. Since a divides sb, by Corollary
10 we know that a/d divides s. Thus s = t(a/d) for some t in Z and n = sb =
tab/d. Thus m divides n and so m is a least common multiple of a and b.
If m' is any other least common multiple of a and b, then m and m' are
nonnegative integers that divide each other. Therefore m = m' and so m is
the unique least common multiple of a and b. From the definition of m, we
have md = |ab|. □


The least common multiple of a and b is denoted lcm(a, b). The
procedure ZLCM in CLASSLIB computes least common multiples.


12 ZLCM 15 120 ZLCM 105 
60 840 


The result when ZLCM is used with nonscalar arrays is the entry-by-entry
least common multiple.
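
Theorem 11 gives a direct way to compute least common multiples from greatest common divisors. A minimal Python sketch of that formula follows; it uses the standard library's math.gcd rather than ZGCD, and the function name lcm is chosen here only for illustration.

```python
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple via Theorem 11: m * gcd(a, b) = |a * b|."""
    d = gcd(a, b)
    if d == 0:                 # only when a = b = 0; the theorem gives m = 0
        return 0
    return abs(a * b) // d


print(lcm(12, 15), lcm(120, 105))   # the two ZLCM examples from the text
```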


EXERCISES 


1 Show directly from the definition that 0 is a greatest common
divisor of 0 and 0.


2 Let M and N be additive subgroups of Z. Prove that M ∩ N is also
an additive subgroup of Z.


3 Complete the proof of Theorem 5. 


4 Without using ZGCD, compute d = gcd(a, b) by means of the
Euclidean algorithm for each of the following pairs a, b.

(a) 12, 39. (c) 4953, 14697.
(b) 217, 413. (d) 737019, 2055168.

5 The Euclidean algorithm reduces the computation of d = gcd(a, b)
to the computation of gcd(a, c), where c = b - qa and 0 ≤ c < |a|.
Suppose we can express d as r'a + s'c. Show how to write d in the
form ra + sb.

6 For each pair of integers a, b in Exercise 4, write gcd(a, b) in the
form ra + sb, where r and s are integers.

7 For each of the following equations, find one integer solution or
show that no integer solution exists.

(a) 7x + 5y = 9. (c) 91x + 141y = -27.
(b) 6x - 9y = 11. (d) 577x - 828y = 1001.

8 The function gcd is a binary operation on Z. Show that gcd is
commutative and associative.

9 Let a and b be integers with 0 < a < b. Determine an upper bound
in terms of a for the number of remainders that have to be calculated
in order to compute gcd(a, b) by the Euclidean algorithm.

10 Let a_1, ..., a_n be integers. A greatest common divisor of a_1, ...,
a_n is a nonnegative integer d such that d divides each a_i and whenever
c is an integer dividing each a_i then c divides d. Show that d
is unique. Let S(a_1, ..., a_n) denote the set of integers of the form
r_1a_1 + ... + r_na_n, where each r_i is an integer. Prove that S(a_1, ...,
a_n) is an additive subgroup of Z and that the nonnegative generator
d of S(a_1, ..., a_n) is the greatest common divisor of a_1, ..., a_n.
[We will write d = gcd(a_1, ..., a_n).]

11 Let a_1, ..., a_n, b be integers. Show that the equation a_1x_1 + ... +
a_nx_n = b has an integer solution if and only if gcd(a_1, ..., a_n)
divides b.

12 Show that the function gcd defined in Exercise 10 satisfies the
condition gcd(a_1, ..., a_n) = gcd(a_1, gcd(a_2, ..., a_n)).

13 Compute gcd(6409, 8177, 13949).

14 Find one integer solution to the equation 12x - 15y + 20z = 29.

15 Let a and b be integers. Show that M = (aZ) ∩ (bZ) is the set of
common multiples of a and b. Prove that M = mZ, where m =
lcm(a, b).

16 Calculate the least common multiples of the pairs of integers in
Exercise 4.

17 What should be the definition of lcm(a_1, ..., a_n)? Compute
lcm(6409, 8177, 13949).




18 Let A and B be positive integers. Write an APL expression defining 
a vector listing the positive common divisors of A and B. 

19 Let A be a vector of integers and let M be an integer. Write APL 
propositions for the following statements. 

(a) M is a common divisor of the components of A.
(b) M is a common multiple of the components of A.

20 For any integer vector A let S(A) be the set of integers of the
form R+.×A, where R is an integer vector. By Exercise 10, the
nonnegative generator of S(A) is the greatest common divisor of the
components of A. What is S(⍳0)? Prove that
(a) S(A) = S(|A).
(b) S(A) = S((A≠0)/A).
(c) If M is any component of A, then S(A) = S(M,M|A).

21 Let A be a nonempty vector with nonzero integer components.
Suppose the statement


A←(A≠0)/A←M,(M←⌊/A)|A←|A


is repeatedly executed. Show that eventually A will have length
1 and, when that happens, the component of A is the greatest
common divisor of the components in the original vector.

22 Write a procedure GCDV based on Exercises 20 and 21 such that
D←GCDV A makes D the greatest common divisor of the
components of A.


23 Modify your procedure GCDV so that it computes a global vector
R such that GCDV A is R+.×A.

24 Write a procedure LCMV such that M←LCMV A defines M to be the
least common multiple of the components of the vector A.


3. CONGRUENCE 


Let n be a fixed positive integer. We say that an integer a is congruent
modulo n to another integer b if n divides a - b. If this is the case, we
write a ≡ b (mod n). Thus 19 ≡ 33 (mod 7) and -8 ≡ 7 (mod 5). The
integer n is called the modulus of the congruence.


THEOREM 1. Congruence modulo n is an equivalence relation on Z
with exactly n equivalence classes. The set {0, 1, ..., n - 1} is a set of
representatives for the equivalence classes.


Proof. Let a, b, c be integers. Clearly, n divides a - a, and so a ≡ a
(mod n). If a ≡ b (mod n), then n divides a - b, and so n divides b - a.
Thus b ≡ a (mod n). If a ≡ b (mod n) and b ≡ c (mod n), then n divides
both a - b and b - c. Therefore n divides (a - b) + (b - c) = a - c and
a ≡ c (mod n). We have now shown that congruence modulo n is an
equivalence relation. By Theorem 1.2, every integer m is congruent modulo n
to an element of X = {0, 1, ..., n - 1}. Clearly, no two distinct elements
of X are congruent modulo n, and so X is a set of representatives for the
equivalence classes. □

An equivalence class of the relation congruence modulo n is called a
congruence class modulo n. The set of all congruence classes modulo n is
denoted Z_n. The congruence class containing a particular integer a will be
written [a]_n, or simply [a] when the modulus n is clear.

We will now define two binary operations on Z_n, that is, two
functions from Z_n × Z_n to Z_n. These binary operations will be denoted + and
×. We would like the equations


[a] + [b] = [a + b],
[a] × [b] = [a × b],


to hold for all a, b in Z. This suggests that we should define + to be the
subset


P = { (([a], [b]), [a + b]) | a, b ∈ Z }


of (Z_n × Z_n) × Z_n. But is P really a function? Certainly P is a relation from
Z_n × Z_n to Z_n. What we need to show is that each ordered pair (x, y) in
Z_n × Z_n occurs as the first component in exactly one element ((x, y), z) of
P. Suppose x = [a] and y = [b]. If we take z to be [a + b], then ((x, y), z)
is in P. Suppose ((x, y), w) is also in P. Then we must have x = [c], y = [d],
and w = [c + d] for some c and d in Z. Now [a] = [c] and [b] = [d], and
so n divides a - c and b - d. Therefore n divides (a - c) + (b - d) = (a + b) -
(c + d). Thus a + b ≡ c + d (mod n) and w = z. Thus (x, y) is the first
component of exactly one element of P, and P is a binary operation on Z_n. By a
similar argument, it can be shown that


T = { (([a], [b]), [a × b]) | a, b ∈ Z }

is a binary operation × on Z_n. Modulo 7 we have


[2] + [6] = [1], [2] × [6] = [5],
[4] + [5] = [2], [4] × [5] = [6].


Showing, as we did previously, that a given relation is really a function
is called proving that the function is well defined. Our arguments showing
that the binary operations + and × on Z_n are well defined can be
summarized in the following theorem.


THEOREM 2. If a ≡ c (mod n) and b ≡ d (mod n), then a + b ≡ c +
d (mod n) and a × b ≡ c × d (mod n). □
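
Well-definedness can be spot-checked numerically. The Python sketch below represents each class [a] by its representative in {0, ..., n - 1} and confirms, for n = 7, that sums and products do not depend on which members of the classes are used. It is a check on a few shifted representatives, not a proof, and the helper names are hypothetical.

```python
N = 7


def cls(a: int, n: int = N) -> int:
    """Representative in {0, ..., n-1} of the congruence class [a] modulo n."""
    return a % n


def operations_agree(n: int = N) -> bool:
    """Check Theorem 2 on several representatives of every pair of classes."""
    for a in range(n):
        for b in range(n):
            for j in (-2, 1, 3):
                for k in (-1, 2):
                    c, d = a + j * n, b + k * n      # c in [a], d in [b]
                    if cls(a + b, n) != cls(c + d, n):
                        return False
                    if cls(a * b, n) != cls(c * d, n):
                        return False
    return True


# The four examples modulo 7 from the text.
print(cls(2 + 6), cls(2 * 6), cls(4 + 5), cls(4 * 5))
print(operations_agree())
```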




Suppose we are given integers a and b. By a solution to the congruence

ax ≡ b (mod n)    (*)


we mean an integer x for which the congruence is true. By Theorem 2, if
x is a solution, then so is every element of the congruence class [x] modulo
n. Thus, solving the congruence (*) amounts to finding those elements
[x] of Z_n such that [a] × [x] = [b]. As the next theorem shows, the
existence of solutions to (*) depends on gcd(a, n).


THEOREM 3. The congruence ax ≡ b (mod n) has solutions if and
only if d = gcd(a, n) divides b. If a solution exists, then it is unique modulo
n/d.

Proof. The integer x satisfies the congruence ax ≡ b (mod n) if and
only if there is an integer z such that ax + nz = b. By Theorem 2.7, the
equation ax + nz = b has solutions if and only if d = gcd(a, n) divides b.
Suppose x satisfies ax ≡ b (mod n). Let y be congruent to x modulo n/d.
Thus y = x + r(n/d) for some r in Z. Then

ay = ax + ar(n/d) = ax + nr(a/d) ≡ ax ≡ b (mod n).

Therefore ay ≡ b (mod n). Conversely, suppose ax ≡ ay ≡ b (mod n). Then


a(x - y) = ax - ay ≡ b - b = 0 (mod n)

and so n divides a(x - y). By Corollary 2.10, n/d divides x - y and hence
x ≡ y (mod n/d). Therefore, if solutions exist, then they are unique modulo
n/d. □


COROLLARY 4. If a, b, and c are integers and ab ≡ ac (mod n), then
b ≡ c (mod n/d), where d = gcd(a, n). In particular, if a and n are relatively
prime, then b ≡ c (mod n). □

Suppose we wish to solve the congruence 35x ≡ 77 (mod 98). From


35 ZGCD 98
7
7|77
0
R
3
S
¯1


we see that gcd(35, 98) = 7 and that 7 divides 77. By Theorem 3, the
congruence has a solution that is unique modulo 98/7 = 14. The global
variables R and S satisfy 7=(35×R)+98×S. Multiplying by 77/7 = 11, we find
that 11×R is one solution. Reducing modulo 14,


14|11×R
5


we obtain the solution x ≡ 5 (mod 14). Modulo 98, the solutions are 5,
19, 33, 47, 61, 75, and 89.
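
The same steps (compute d = gcd(a, n) together with a coefficient r, test whether d divides b, scale, and reduce modulo n/d) can be sketched in Python. This follows Theorem 3 but is not CLASSLIB code; the function name and the choice to return every solution in the range 0 to n - 1 are assumptions made for the illustration.

```python
def solve_linear_congruence(a: int, b: int, n: int) -> list:
    """All x with 0 <= x < n and a*x ≡ b (mod n); an empty list if none exist."""
    # Extended Euclidean algorithm, tracking only the coefficient of a,
    # so that at the end d == r*a + (something)*n.
    old_rem, rem, old_r, r = a, n, 1, 0
    while rem != 0:
        q = old_rem // rem
        old_rem, rem = rem, old_rem - q * rem
        old_r, r = r, old_r - q * r
    d, r = old_rem, old_r
    if d < 0:
        d, r = -d, -r
    if b % d != 0:
        return []                          # Theorem 3: no solution exists
    step = n // d                          # solutions are unique modulo n/d
    x0 = (r * (b // d)) % step
    return [x0 + k * step for k in range(d)]


print(solve_linear_congruence(35, 77, 98))
```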


There is another way to solve the congruence 35x ≡ 77 (mod 98) on
an APL terminal system. If we let X←⍳98, then the Ith component of the
logical vector 77=98|35×X is 1 if and only if I is a solution of the
congruence. Thus we can obtain a list of the solutions by forming


(77=98|35×X)/X←⍳98
5 19 33 47 61 75 89


This method of trying all possible values modulo 98 would not be feasible
in hand computation nor would it be feasible even at a terminal for a
congruence such as 876543x ≡ 123456 (mod 10^6). However, it is useful for
solving some congruences that are not of the form ax ≡ b (mod n). For
example, suppose we wish to find all integers x such that x^3 + 5x^2 - 2x +
6 ≡ 0 (mod 110). From


(0=110|6+X×¯2+X×5+X)/X←⍳110
31 42 86 97


we see that the solutions are x ≡ 31, 42, 86, and 97 (mod 110). The APL
expression used to evaluate the polynomial in this example is based on the
identity


x^3 + 5x^2 - 2x + 6 = 6 + x(-2 + x(5 + x)).


This approach to polynomial evaluation, which reduces the number of
arithmetic operations involved, is referred to as Horner's method.
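
A Python sketch of the same trial-and-error search, with the polynomial evaluated by Horner's method, is given below. The function names are hypothetical, and the coefficient list runs from the leading term down, which is a convention chosen here rather than one taken from the text.

```python
def horner(coeffs, x, n):
    """Evaluate a polynomial at x modulo n; coeffs run from the leading term down."""
    value = 0
    for c in coeffs:
        value = (value * x + c) % n    # one multiplication and one addition per term
    return value


def roots_mod(coeffs, n):
    """All x in 1..n (the analogue of the APL vector of the first n integers) that are roots modulo n."""
    return [x for x in range(1, n + 1) if horner(coeffs, x, n) == 0]


print(roots_mod([1, 5, -2, 6], 110))   # x^3 + 5x^2 - 2x + 6 modulo 110
```

Reducing modulo n after every step keeps the intermediate values small, which the APL expression in the text does not need for n = 110 but which matters for larger moduli.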


EXERCISES 


1 Our definition of congruence modulo n makes sense even when
n = 0. What parts of Theorem 1 remain true when n = 0?

2 Let m and n be positive integers. The sets Z_m and Z_n are both
partitions of Z. Show that Z_m is a refinement of Z_n if and only
if n divides m. (See Exercise 1.7.)

3 Show that the set T defined in this section is a function from
Z_n × Z_n to Z_n.

4 Find all solutions, if any, to the following congruences.
(a) 12x ≡ 11 (mod 17). (c) 853x ≡ 472 (mod 1999).
(b) 30x ≡ 44 (mod 125). (d) 20623x ≡ 30143 (mod 40877).

5 Solve the congruence x^4 + x^3 - 3x^2 + 2x + 4 ≡ 0 (mod 455).

6 Let A, B, and N be integers with N > 0. Write an APL proposition
corresponding to the statement that A is congruent to B modulo
N.

7 Let A, M, and N be integers with N > 0. Write an APL expression
for the element of the set M+⍳N that is congruent to A modulo
N. (Assume ⎕IO←1.)


8 Any solution of the congruence ax ≡ b (mod n) is also a solution
of ax ≡ b (mod m) for any positive divisor m of n. We may use
this fact to solve the congruence 876543x ≡ 123456 (mod 10^6)
by a modified trial-and-error process. Try all elements of ⍳100
to find the solution modulo 10^2. Each solution modulo 10^2 can
yield up to 100 solutions modulo 10^4. Construct a vector of
reasonable length that must contain every solution modulo 10^4 and
test to find the solution modulo 10^4. Repeat the process to find all
solutions modulo 10^6.


9 Use the ideas of Exercise 8 to solve the congruence x^2 + 786448x +
128767 ≡ 0 (mod 10^6).


4. PRIMES 


A prime number or prime is a positive integer with exactly two positive
divisors. Thus 1 is not a prime because it has only one positive divisor and
4, 6, 8, 9, and 10 are not primes because they each have at least three
positive divisors. The first five primes are 2, 3, 5, 7, and 11. Since 1 and p
are positive divisors of any positive integer p, we could also define a prime
to be an integer greater than 1 with no positive divisors other than 1 and
itself. An integer greater than 1 that is not a prime is said to be composite.


THEOREM 1. Let p be a prime and let a and b be integers. If p divides
ab, then either p divides a or p divides b.


Proof. Let d = gcd(a, p). Then d is a positive divisor of p and so is 1 or
p. If d = p, then p divides a. If d = 1, then p divides b by Theorem 2.9. □


It follows from Theorem 1 that if a prime p divides a product a_1a_2...a_n
of integers, then p divides a_i for some i.
We will now prove a very important result.


THEOREM 2 (Fundamental Theorem of Arithmetic). Each positive
integer n can be factored uniquely as p_1p_2...p_r, where each p_i is a prime
and p_1 ≤ p_2 ≤ ... ≤ p_r.


Proof. We prove first by induction that n can be factored as a product
of primes. If n is 1, then n is the product of the empty sequence of primes,
so we can get our induction started. Suppose n > 1 and every positive
integer less than n can be factored into primes. If n is a prime, then n is
the product of the single prime n. Suppose n is not a prime. Then we can
write n as uv, where u and v are integers greater than 1 and less than n. By
induction, we can factor u and v into primes. Arranging the prime factors
of u and v into one sequence gives the required factorization of n.
Now we prove uniqueness, again by induction on n. Suppose


n = p_1p_2...p_r = q_1q_2...q_s

with each p_i and q_j a prime and p_1 ≤ p_2 ≤ ... ≤ p_r and q_1 ≤ q_2 ≤ ... ≤ q_s.
We want to show r = s and p_i = q_i, 1 ≤ i ≤ r. If r or s is 0, then n = 1 and the
result is trivial. Thus we may assume r and s are both at least 1. Since p_1
divides q_1...q_s, it follows that p_1 divides some q_j. But this implies that
p_1 = q_j. Similarly, q_1 = p_i for some i. Then


p_1 ≤ p_i = q_1 ≤ q_j = p_1,

so p_1 and q_1 must be equal. Let m = n/p_1. We have


m = p_2...p_r = q_2...q_s

and, by induction, r = s and p_i = q_i, 2 ≤ i ≤ r. □


It follows immediately from Theorem 2 that any positive integer n
can be written uniquely in the form


∏_{i=1}^{r} p_i^{e_i},


where p_1 < p_2 < ... < p_r, each p_i is a prime, and each e_i is a positive
integer.

It is possible to write a single APL expression defining the vector of
primes less than or equal to a given integer N. However, the space and CPU
requirements for executing this expression are very large. A more efficient
method of listing primes is the so-called sieve procedure, which we will
illustrate with a small example. Let us assume that we wish to construct
a vector P consisting of all primes not exceeding 25. We begin by setting
P equal to the empty vector and letting Q be the vector of integers from
2 to 25.

P←⍳0
⎕←Q←1↓⍳25
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

The first prime is the first component of Q, or 2. We add 2 to P and delete
all multiples of 2 from Q.

⎕←P←P,2
2
⎕←Q←(0≠2|Q)/Q
3 5 7 9 11 13 15 17 19 21 23 25


The next prime is the new first component of Q. Again, we add this prime
to P and delete its multiples from Q.


⎕←P←P,3
2 3
⎕←Q←(0≠3|Q)/Q
5 7 11 13 17 19 23 25


Repeating this process once more, we get


⎕←P←P,5
2 3 5
⎕←Q←(0≠5|Q)/Q
7 11 13 17 19 23


All of the remaining components of Q have no prime factors less than 7. If
some component Q[I] is not a prime, then Q[I] is a product of two or more
primes each greater than 6, and so is at least 7 × 7 = 49. Since none of the
components of Q exceeds 25, this is impossible. Therefore all of the
remaining components in Q are primes, and we complete our computation with


⎕←P←P,Q
2 3 5 7 11 13 17 19 23


The procedure ZPRIMES in CLASSLIB is based on this technique of sieving
out the composite numbers from 1↓⍳N.


ZPRIMES 25
2 3 5 7 11 13 17 19 23
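
The sieve steps above translate almost line for line into Python. The sketch below mirrors the vectors P and Q of the example; it illustrates the technique and makes no claim about how ZPRIMES is actually written. The stopping rule (stop once the square of the first candidate exceeds the last candidate) is the observation made in the text for 7 and 25.

```python
def primes_up_to(n: int) -> list:
    """Primes not exceeding n, by the sieve procedure described in the text."""
    p = []                                  # primes found so far (the vector P)
    q = list(range(2, n + 1))               # remaining candidates (the vector Q)
    while q and q[0] * q[0] <= q[-1]:
        prime = q[0]                        # the first candidate is always prime
        p.append(prime)
        q = [x for x in q if x % prime != 0]    # delete its multiples from Q
    return p + q                            # everything left in Q is prime


print(primes_up_to(25))
```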


It is very important to be able to factor a given positive integer into a
product of primes, and a great deal of effort has been spent on trying to
develop efficient factoring algorithms. We cannot go very deeply into this
subject here. For a more complete discussion, consult Knuth [Vol. 2]. See
also Section 3.3.

Suppose we wish to factor 4693. Since no positive integer n can have
more than one prime factor greater than √n, we try first to determine the
prime factors of 4693 not exceeding √4693 = 68.5... . One way is as
follows.


P←ZPRIMES 68
(0=P|4693)/P
13 19


We see that 4693 is divisible by 13 and 19 and by no other primes less
than 68. Computing the quotient


4693÷13×19
19


we find that 4693 = 13 × 19 × 19. If we wish to factor an integer n and a
list of primes is not readily available, the easiest thing to do is to divide n
by 2 and by every odd number not exceeding √n.

It turns out to be easier to decide whether or not a large integer is
prime than it is to factor a large integer known to be composite. One test
for primality is discussed in Section 3.3. The procedure ZFACTOR in
CLASSLIB looks for factors less than 50000.


⎕←P←ZFACTOR 33263
29 31 37
×/P
33263
ZFACTOR 10*7
2 2 2 2 2 2 2 5 5 5 5 5 5 5


With ZFACTOR, the complete factorization of any integer up to 2.5 × 10^9
can be obtained, and often the factorizations of much larger numbers can
be found. Using the results of Section 3.3, it is possible to formulate more
powerful factoring algorithms.
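
The "easiest thing to do" mentioned above, dividing by 2 and then by successive odd numbers up to the square root, can be sketched in Python as follows. This is plain trial division under that description; it has no 50000 limit and is not the ZFACTOR procedure, and the name factor is hypothetical.

```python
def factor(n: int) -> list:
    """Prime factors of a positive integer n, in nondecreasing order."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:          # divide out d as often as it occurs
            factors.append(d)
            n //= d
        d += 1 if d == 2 else 2    # after 2, try only odd trial divisors
    if n > 1:
        factors.append(n)          # at most one prime factor can exceed sqrt(n)
    return factors


print(factor(4693), factor(33263))
```

Composite odd trial divisors such as 9 or 15 are harmless: their prime factors have already been divided out, so they never divide the remaining quotient.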

We close this section with an important result concerning the
solutions of simultaneous congruences. Let m_1, ..., m_r be positive integers
and let a_1, ..., a_r be integers. A solution of the system


x ≡ a_1 (mod m_1),
x ≡ a_2 (mod m_2),
...
x ≡ a_r (mod m_r),


of simultaneous congruences is one integer x that is a solution of each
congruence. The existence and uniqueness of solutions to this system
depend on the numbers gcd(m_i, m_j). We say that the integers m_1, ..., m_r
are pairwise relatively prime if gcd(m_i, m_j) = 1 for i ≠ j.


LEMMA 3. Let m_1, ..., m_r, r ≥ 1, be a sequence of pairwise
relatively prime nonzero integers. Set n = m_1m_2...m_r and v_i = n/m_i. Then
gcd(v_1, ..., v_r) = 1 and lcm(m_1, ..., m_r) = n.


Proof. Let d = gcd(v_1, ..., v_r). If d ≠ 1, then d is divisible by a prime
p. Since d divides v_1 and v_1 divides n, it follows that d divides n. Thus p
divides some m_i. Since m_1, ..., m_r are pairwise relatively prime, p does
not divide m_j for j ≠ i. But


v_i = m_1m_2...m_{i-1}m_{i+1}...m_r


and so p does not divide v_i, contradicting our choice of p. We leave the
proof that lcm(m_1, ..., m_r) = n as an exercise. □


THEOREM 4 (Chinese Remainder Theorem). Let m_1, ..., m_r, r ≥ 1,
be a sequence of pairwise relatively prime positive integers and let a_1, ..., a_r
be integers. There exists an integer x such that x ≡ a_i (mod m_i), 1 ≤ i ≤ r,
and x is unique modulo n = m_1m_2...m_r.


Proof. First, we prove uniqueness. Suppose x and y are both solutions
of the simultaneous congruences. Then x ≡ y (mod m_i), 1 ≤ i ≤ r, and so
x - y is a multiple of m_i for each i. Therefore x - y is divisible by
lcm(m_1, ..., m_r), which by Lemma 3 is n. Therefore x ≡ y (mod n). It is
also clear that if x is a solution and y ≡ x (mod n), then y is a solution, too.

Now we prove that a solution of the simultaneous congruences exists.
Let v_i = n/m_i, 1 ≤ i ≤ r. By Lemma 3, gcd(v_1, ..., v_r) = 1, and so there exist
integers c_1, ..., c_r such that


∑_{i=1}^{r} c_i v_i = 1.

Define x by

x = ∑_{i=1}^{r} a_i c_i v_i.

We have v_j ≡ 0 (mod m_i) for j ≠ i, and so c_i v_i ≡ 1 (mod m_i), while c_j v_j ≡
0 (mod m_i) for j ≠ i. Therefore

x ≡ a_i c_i v_i ≡ a_i (mod m_i), 1 ≤ i ≤ r,

and thus x is a solution. □


As an example, let us solve the system


x ≡ 19 (mod 27),
x ≡ 7 (mod 32).


In the notation of Theorem 4 and its proof we have m_1 = 27, m_2 = 32,
a_1 = 19, and a_2 = 7.


M←27 32
A←19 7

First, we compute n, v_1, and v_2.

⎕←N←×/M
864
⎕←V←N÷M
32 27


Then we express 1 = gcd(v_1, v_2) as c_1v_1 + c_2v_2.


V[1] ZGCD V[2]
1
⎕←C←R,S
11 ¯13
C+.×V
1

We obtain one solution by


⎕←X←+/A×C×V
4231


Reducing modulo N,


N|X
775


we see that x ≡ 775 (mod 864) is our solution. As a check, we can
compute


M|775
19 7


The procedure ZCHREM in CLASSLIB can be used to solve
simultaneous congruences.


19 7 ZCHREM 27 32
775
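
The construction in the proof of Theorem 4 can be sketched in Python. One deliberate difference should be noted: instead of coefficients c_i with c_1v_1 + ... + c_rv_r = 1, this sketch takes each c_i to be an inverse of v_i modulo m_i, which gives the same property c_iv_i ≡ 1 (mod m_i) that the proof uses. It assumes pairwise relatively prime moduli and Python 3.8 or later (for math.prod and three-argument pow with exponent -1); it is not the ZCHREM procedure.

```python
from math import prod


def chinese_remainder(a, m):
    """Least nonnegative x with x ≡ a[i] (mod m[i]) for pairwise coprime m[i] > 0."""
    n = prod(m)
    x = 0
    for a_i, m_i in zip(a, m):
        v_i = n // m_i
        c_i = pow(v_i, -1, m_i)    # c_i * v_i ≡ 1 (mod m_i); ValueError if not coprime
        x += a_i * c_i * v_i       # this term is ≡ a_i mod m_i and ≡ 0 mod the others
    return x % n


print(chinese_remainder([19, 7], [27, 32]))
```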


This procedure can solve several systems of congruences with the same
vector of moduli. If M is a vector of nonzero integers and A is a matrix
of integers, then X←A ZCHREM M is a vector of integers such that X[I] is
congruent to A[I;J] modulo M[J] for all I and J, provided such a vector
of integers exists. (See Exercise 9.) In addition, ZCHREM computes the global
variable N, which is the least common multiple of the components of M.


M←4 6 9
⎕←A←2 3⍴1 5 8 3 1 4
1 5 8
3 1 4
⎕←X←A ZCHREM M
17 31
⍉M∘.|X
1 5 8
3 1 4
N
36


EXERCISES 


1 Complete the proof of Lemma 3.

2 Construct the vector P of primes not exceeding 100 using the sieve
procedure described in the text.

3 Write an APL expression for a vector listing the positive integers
less than 100 that are relatively prime to 100.

4 At a terminal, compute the factorizations of the following
integers into a product of primes without using the procedure ZFACTOR.
(a) 6726. (c) 452521.
(b) 2003. (d) 987027.

5 Find the first prime larger than n for n = 10^6, 10^7, 10^8, and 10^9.

6 Let N be a positive integer and assume Q←SSORT P←ZFACTOR N.
Write an expression for the vector E such that N is Q×.*E.

7 Let P be a vector whose components are distinct primes and let
E and F be vectors of length ⍴P with nonnegative integer
components. Set M←P×.*E and N←P×.*F. Show that M ZGCD N is
P×.*E⌊F and M ZLCM N is P×.*E⌈F.

8 Solve the following systems of simultaneous congruences.

(a) x ≡ 3 (mod 8), (c) z ≡ 19 (mod 64),
x ≡ 7 (mod 9). z ≡ 20 (mod 81),
z ≡ 21 (mod 125).
(b) y ≡ 47 (mod 81),
y ≡ 101 (mod 125).

9 Prove the generalization of the Chinese Remainder Theorem that
states that there exists an integer x satisfying x ≡ a_i (mod m_i),
1 ≤ i ≤ r, if and only if gcd(m_i, m_j) divides a_i - a_j for all i and j
and, if a solution exists, then it is unique modulo lcm(m_1, ..., m_r).

10 Describe all solutions of the following simultaneous congruences.
7x ≡ 3 (mod 16),
10x ≡ 6 (mod 36),
55x ≡ 30 (mod 75).
11 Suppose 2^p - 1 is a prime. Show that p is a prime.
12 Prove that 2^p - 1 is a prime for p = 2, 3, 5, 7, 13 but not for
p = 11.
13 Show that the set of primes is infinite.
14 Familiarize yourself with the procedure ZCHREM in CLASSLIB.


*5. MULTIPLE-PRECISION ARITHMETIC 


The largest prime known, at the time this book was written, was 2^44497 - 1,
a number with 13395 decimal digits. Primes of the form 2^p - 1 are called
Mersenne primes after the French mathematician Marin Mersenne (1588-
1648). If m and n are positive integers, then


2^{mn} - 1 = (2^m - 1)(2^{(n-1)m} + 2^{(n-2)m} + ... + 2^m + 1).


Thus N = 2^p - 1 can be a prime only when p is a prime. We remarked in
the last section that it is easier to decide whether or not a number is prime
than it is to factor a number known to be composite. For Mersenne numbers,
that is, numbers of the form 2^p - 1 with p a prime, a very powerful test
for primality has been discovered. This test involves the sequence of integers
defined as follows. Let S_1 = 4 and set S_{n+1} = (S_n)^2 - 2 for n ≥ 1. Thus
S_2 = 14, S_3 = 194, and S_4 = 37634.

THEOREM 1 (Lucas and Lehmer). Let N = 2^p - 1, with p an odd
prime. Then N is a prime if and only if N divides S_{p-1}.


Proof. A proof of this result may be found in Knuth [Vol. 2]. □


Let us apply the test in Theorem 1 to the first few Mersenne numbers.
If p = 3, then N = 2^3 - 1 = 7, which does divide S_{p-1} = S_2 = 14. If p = 5,
then N = 2^5 - 1 = 31 and S_4 = 37634 = 31 × 1214. To check the case
p = 7, it seems necessary to compute S_6, which is a number with 19 decimal
digits. However, all we are really interested in is whether or not S_6 ≡ 0 (mod
N), where N = 2^7 - 1 = 127. To decide this, we need only compute the
terms S_k modulo N for 1 ≤ k ≤ 6. This can be done as follows.


⎕←N←¯1+2*7
127
S←4
⎕←S←N|¯2+S×S
14
⎕←S←N|¯2+S×S
67
⎕←S←N|¯2+S×S
42
⎕←S←N|¯2+S×S
111
⎕←S←N|¯2+S×S
0


Thus S_6 is divisible by 127. The same procedure may be used to show
that S_10 is congruent to 1736 modulo 2^11 - 1 = 2047, so Theorem 1 tells
us that 2047 is not a prime. Factoring,

ZFACTOR 2047
23 89

we see that 2047 = 23 × 89.
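
The reduction of S_k modulo N at every step is easy to express in Python, whose built-in integers have unlimited precision. The sketch below is an illustration of Theorem 1 as stated, with hypothetical function names; it does not check that p is an odd prime, which the theorem requires.

```python
def lucas_lehmer_residue(p: int) -> int:
    """S_{p-1} modulo N = 2**p - 1, where S_1 = 4 and S_{k+1} = S_k**2 - 2."""
    n = 2**p - 1
    s = 4                          # S_1
    for _ in range(p - 2):         # p - 2 squarings lead from S_1 to S_{p-1}
        s = (s * s - 2) % n
    return s


def is_mersenne_prime(p: int) -> bool:
    """Lucas-Lehmer criterion; meaningful only when p is an odd prime."""
    return lucas_lehmer_residue(p) == 0


print(lucas_lehmer_residue(7), lucas_lehmer_residue(11))
```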

Let us now jump to the case p = 37. On one terminal system the
following results were obtained for the first few terms in the sequence S_k
modulo 2^37 - 1.


⎕←N←¯1+2*37
137438953471
S←4
⎕←S←N|¯2+S×S
14
⎕←S←N|¯2+S×S
194
⎕←S←N|¯2+S×S
37634
⎕←S←N|¯2+S×S
1416317954
⎕←S←N|¯2+S×S
111419319478


From this output it seems that S_6 is congruent to 111419319478 modulo
2^37 - 1. However, the correct congruence is


S_6 ≡ 111419319480 (mod 2^37 - 1).


What went wrong? The answer is that we have exceeded the precision of
this particular terminal system. The square of 1416317954 is


2005956546822746116,


which has 19 digits. Unfortunately, our terminal system performs
arithmetic operations with an accuracy of only about 16 decimal digits, and the
square was not correctly computed.
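
The same loss of low-order digits can be reproduced with ordinary floating-point numbers. The Python sketch below assumes IEEE double precision (a 53-bit significand, roughly 16 decimal digits), which is not necessarily the arithmetic used by the terminal system in the text, so the size of the error may differ from the one shown above.

```python
s = 1416317954

exact = s * s                          # Python integers are exact at any size
rounded = int(float(s) * float(s))     # the product rounded to a 53-bit significand

print(exact)
print(exact - rounded)                 # nonzero: the low-order digits were lost
print((exact - 2) % (2**37 - 1))       # the next term of the sequence, computed exactly
```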

Clearly, we cannot compute S_7, S_8, ..., S_36 modulo 2^37 - 1 using
the previous approach. The intermediate results require greater precision
than is possible using a single APL scalar to store our integers. In CLASSLIB
there are several procedures for performing arithmetic operations using
multiple precision, in which very large integers are represented by vectors
of smaller integers. The procedures in CLASSLIB require that a large
integer be represented by a character vector listing the decimal digits of the
number with an initial plus or minus sign optional. For example, the
following character vectors are valid representations of multiple-precision
integers.

X←'375298010327654431'
Y←'¯28971132410569821536'
Z←'+9876543210123456789'


The procedures MPZSUM and MPZPROD compute sums and products of
multiple-precision integers. (The prefix MPZ stands for "multiple-precision
integer".)


X MPZSUM Y
¯28595834400242167105
X MPZPROD Z
3706647015674438332992604563952882059


There are also procedures for computing differences, powers, and
remainders.


Z MPZDIFF X
9501245199795802358
X MPZPOWER 2
140848596555896211951322046153933761
X MPZREM Y
302112394987224082


Note that the right argument of MPZPOWER is the exponent given by an
ordinary APL integer. The method used by MPZPOWER to keep the number
of multiplications to a minimum is discussed in Section 3.1. The result of
X MPZREM Y is the remainder when Y is divided by X. The integer quotient
in this division is saved in the global variable Q.


'1234567' MPZREM '9876543210123'
1037288
Q
8000005
1234567 ZREM 9876543210123
1037288
9876543210123 ZQUOT 1234567
8000005


The procedure MPZMAG computes the absolute value, while MPZSGN
corresponds to the monadic signum operation ×.


MPZMAG Y
28971132410569821536
MPZSGN Y
¯1


We can use the multiple-precision procedures in CLASSLIB to complete
our testing of the Mersenne number 2^37 - 1 to see if it satisfies the Lucas-
Lehmer criterion of Theorem 1. The reader should be aware, however, that
a significant amount of CPU time will be required. If S is one term of the
sequence S_i modulo N, then the next term modulo N is


      S←N MPZREM (S MPZPROD S) MPZDIFF '2'

Starting with


      ⎕←N←('2' MPZPOWER 37) MPZDIFF '1'
137438953471
      S←'4'


we find that S_36 is congruent to 117093979072 modulo 2^37 - 1 and so
2^37 - 1 is not a prime. There is another way to see this.


ZFACTOR 137438953471 
223 616318177 


Since 2^37 - 1 turns out to have a rather small prime factor, ZFACTOR is
able to handle it easily. For large primes p, however, the approach used in
ZFACTOR is inferior to the Lucas-Lehmer criterion for deciding whether
2^p - 1 is prime.
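
The test just described can be sketched in a few lines of Python. This is an illustration rather than a transcription of the APL: the function name lucas_lehmer is ours, and we assume the usual indexing in which S_1 = 4 and 2^p - 1 is prime exactly when S_(p-1) is congruent to 0 modulo 2^p - 1.

```python
def lucas_lehmer(p):
    """Return True if 2**p - 1 passes the Lucas-Lehmer test (p an odd prime)."""
    n = 2**p - 1
    s = 4                      # S_1
    for _ in range(p - 2):     # produces S_2, ..., S_(p-1), each reduced modulo n
        s = (s * s - 2) % n
    return s == 0

print(lucas_lehmer(37))                  # the test fails, so 2**37 - 1 is composite
print(223 * 616318177 == 2**37 - 1)      # the factorization found by ZFACTOR
```

Because every term is reduced modulo n before it is squared, no intermediate value exceeds n^2, which is the reason for working modulo N in the text.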

While it would divert us too much to explore the workings of the 
multiple-precision procedures in complete detail, it will be useful to ex- 
amine some of the basic ideas involved, since they also arise in the poly- 
nomial manipulation procedures discussed in Section 4.4. 

We normally represent integers using decimal notation. When we
write the number 2573, we mean 2 × 10^3 + 5 × 10^2 + 7 × 10 + 3. The
number 10 is called the base or radix of the notational system. The choice 
of 10 as a base came about mainly for anatomical, not mathematical, rea- 
sons. Our methods of measuring time and angles are derived from a Bab-
ylonian notational system with base 60. The internal representation of
numbers in computers usually involves bases of 2, 8, or 16. Normal input 
and output of APL terminal systems is in decimal notation, but the encode 
and decode operations allow us to work with any base. 


      10 10 10 10⊤2573
2 5 7 3
      (5⍴8)⊤2573
0 5 0 1 5
      (13⍴2)⊤2573
0 1 0 1 0 0 0 0 0 1 1 0 1
      8⊥1 7 7 6
1022


When different bases are used in the same discussion, the base is written 
as a subscript. The preceding computations show that 


2573_10 = 5015_8 = 101000001101_2 and 1776_8 = 1022_10.
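
The encode and decode operations are easy to imitate in other languages. The Python sketch below uses hypothetical helper names encode and decode and, unlike APL's encode, takes a single base and a width rather than a vector of radices.

```python
def encode(n, base, width):
    """Digits of n in the given base, most significant first, padded to width."""
    digits = []
    for _ in range(width):
        n, d = divmod(n, base)   # peel off the least significant digit
        digits.append(d)
    return digits[::-1]

def decode(digits, base):
    """Value of a digit vector, most significant digit first (Horner's rule)."""
    value = 0
    for d in digits:
        value = value * base + d
    return value

print(encode(2573, 10, 4), encode(2573, 8, 5), decode([1, 7, 7, 6], 8))
```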


Suppose we are given the vectors X and Y of the decimal digits of two 
positive integers x and y. How can we describe the vectors of digits for 
x + y and xy? To answer this question, let us consider an example with 
x = 9318 and y = 3728. 

      ⎕←X←(4⍴10)⊤9318
9 3 1 8
      ⎕←Y←(4⍴10)⊤3728
3 7 2 8

If we form

      ⎕←Z←X+Y
12 10 3 16

we get a vector Z such that 10⊥Z is x + y

      10⊥Z                       9318+3728
13046                      13046


but Z is not a valid vector of digits in the base 10 system. We have not 
allowed for any carry from one digit position to the next position on the 
left. The vector of digits to be carried is 


      Z ZQUOT 10
1 1 0 1


and the carry may be performed as follows. 


      ⎕←Z←(0,10|Z)+(Z ZQUOT 10),0
1 3 0 4 6




To compute the product of x and y by hand, we would probably write 


        9318
        3728
    --------
       74544
      18636
     65226
    27954
    --------
    34737504


In carrying out this calculation, we had to multiply each digit of x by 
each digit of y. This suggests that to formulate a multiplication algorithm 
in APL we should first form the outer product 


      ⎕←U←Y∘.×X
27  9  3 24
63 21  7 56
18  6  2 16
72 24  8 64


The rows of U correspond to the four intermediate rows in the hand com- 
putation, but in reverse order. However, in the hand computation the 
intermediate rows are shifted horizontally. We can perform this shifting 
as follows. 


      ⎕←V←0 ¯1 ¯2 ¯3⌽U,4 3⍴0
27  9  3 24  0  0  0
 0 63 21  7 56  0  0
 0  0 18  6  2 16  0
 0  0  0 72 24  8 64


Adding the columns of V and making the necessary carries, 


      ⎕←W←+⌿V
27 72 42 109 82 24 64
      ⎕←W←(0,10|W)+(W ZQUOT 10),0
2 14 6 12 17 4 10 4
      ⎕←W←(0,10|W)+(W ZQUOT 10),0
0 3 4 7 3 7 5 0 4
      9318×3728
34737504


we obtain the vector of digits of xy. Note that adding the carry digits can 
cause further carries and that our method of handling carries sometimes 
introduces leading zeros in the digit vector. 
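
The addition and multiplication steps above can be collected into a short Python sketch. The names carry, dec_sum, and dec_prod are ours, and dec_sum assumes for simplicity that the two digit vectors have the same length.

```python
def carry(digits, base=10):
    """Repeat the carry step from the text until every entry is a valid digit."""
    digits = list(digits)
    while any(d >= base for d in digits):
        low = [0] + [d % base for d in digits]     # (0,10|Z) in the text
        high = [d // base for d in digits] + [0]   # (Z ZQUOT 10),0 in the text
        digits = [a + b for a, b in zip(low, high)]
    return digits

def dec_sum(x, y):
    # Entry-by-entry sum of equal-length digit vectors, then carries.
    return carry([a + b for a, b in zip(x, y)])

def dec_prod(x, y):
    # w[k] is the sum of the k-th column of the shifted outer product V.
    w = [0] * (len(x) + len(y) - 1)
    for i, a in enumerate(y):
        for j, b in enumerate(x):
            w[i + j] += a * b
    return carry(w)

print(dec_sum([9, 3, 1, 8], [3, 7, 2, 8]))
print(dec_prod([9, 3, 1, 8], [3, 7, 2, 8]))
```

As in the APL session, each carry pass lengthens the vector by one position, so leading zeros can appear in the result.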

It is possible to formulate algorithms for subtraction and division in 
decimal notation in much the same way as we have described decimal addi-
tion and multiplication. We could write APL procedures for doing mul-
tiple-precision arithmetic using decimal notation, but this would be a waste 
of both time and space. (Why?) The multiple precision procedures in 
CLASSLIB use a base of 10°. The character vector arguments are converted 
to vectors of ‘‘digits’’ in base 10° notation, the appropriate arithmetic oper- 
ation is performed, and the result is converted back to a character vector. 

There is another method for performing arithmetic operations with 
large integers. Let m_1, ..., m_r be a sequence of pairwise relatively prime integers and
let n = m_1 m_2 ... m_r. By the Chinese Remainder Theorem, an integer x with
-n/2 < x < n/2 is uniquely determined by the vector (x_1, ..., x_r), where
x_i is the remainder when x is divided by m_i. If y is another integer and
(y_1, ..., y_r) is the corresponding vector of remainders, then the vectors
of remainders for x + y and xy are easily computed. (See Exercise 8.) This
method of modular arithmetic is very efficient for carrying out addition,
subtraction, and multiplication, provided the results are known in advance
to lie in the interval (-n/2, n/2). However, it is much harder to perform
division and to decide which of two numbers is larger using this modular 
representation. A more detailed discussion of the advantages and disad- 
vantages of modular arithmetic can be found in Knuth [Vol. 1]. 
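
A small Python sketch may make the modular representation concrete. The moduli 7, 11, 13, 15 are a hypothetical choice made only for illustration, and all function names are ours; the reconstruction step uses the standard Chinese Remainder formula and needs Python 3.8 or later for math.prod and pow(a, -1, m).

```python
from math import prod

MODULI = [7, 11, 13, 15]     # hypothetical pairwise relatively prime moduli
N = prod(MODULI)             # 15015, so results must lie strictly between -N/2 and N/2

def to_residues(x):
    return [x % m for m in MODULI]

def add(xs, ys):
    return [(a + b) % m for a, b, m in zip(xs, ys, MODULI)]

def mul(xs, ys):
    return [(a * b) % m for a, b, m in zip(xs, ys, MODULI)]

def from_residues(rs):
    """Recover the unique x with -N/2 < x < N/2 (N is odd for these moduli)."""
    x = 0
    for r, m in zip(rs, MODULI):
        n_i = N // m
        x += r * n_i * pow(n_i, -1, m)   # modular inverse of n_i modulo m
    x %= N
    return x - N if x > N // 2 else x

a, b = to_residues(-57), to_residues(83)
print(from_residues(add(a, b)), from_residues(mul(a, b)))
```

Addition and multiplication never look at more than one modulus at a time, which is the source of the efficiency mentioned above; comparison and division have no such componentwise form.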


EXERCISES 


1 Arrange the following integers in increasing order: 1111101010_2,
1750_8, 1001_10, 6B2_12, 3E7_16. (For bases 12 and 16 the "digits"
A, B, C, D, E, and F have decimal values 10, 11, 12, 13, 14, and
15, respectively.)

2 Write APL procedures DECSUM and DECPROD for performing addi-
tion and multiplication of positive integers in decimal notation. 
That is, if X and Y are the vectors of decimal digits for two positive 
integers x and y, then X DECSUM Y and X DECPROD Y should 
be the vectors of decimal digits for x + y and xy, respectively. The 
procedures should be in the spirit of the examples in the text. 

3 Describe how to perform long division in decimal notation using 
APL operations on the vectors of decimal digits. 

4 Write a procedure LUCASLEHMER such that if P is an odd prime,
then LUCASLEHMER P is 1 if N←¯1+2*P is a prime and 0 if N is
composite. Show that the first nine Mersenne primes are the num-
bers 2^p - 1, with p = 2, 3, 5, 7, 13, 17, 19, 31, 61. (Warning!
This could be expensive if carried out in the most obvious manner.)


5 Use the procedure MPZREM and the Euclidean algorithm to compute
the greatest common divisor of

5333791861472533233390506543364021456194114
and
6175345284192902200147081485889446083450374.

(Use the result of Exercise 2.9 to estimate the amount of CPU
time that will be required.)


6 Compute the exact value of the term S, in the sequence S_i of
the Lucas-Lehmer test.


7 Although the multiple-precision procedures in CLASSLIB are
designed to manipulate integers, they can be used to compute with 
decimal fractions as well. Explain how this can be done. As an 
example, compute 2 to 30 decimal places. 


8 Describe how to perform addition and multiplication using modular
representations. That is, if M is a vector of positive integers and
A and B are integers, show how to construct M|A+B and M|A×B
from X←M|A and Y←M|B.


GROUPS 


In this chapter we encounter the concept of a group, our first example of 
an abstract algebraic system. Groups occur in almost every branch of mathe- 
matics and are frequently used to describe the symmetry present in some 
mathematical structure. We will investigate many examples of specific or 
“‘concrete’’ groups. However, in order to isolate the common features that 
all of our examples share, we will also study ‘“‘abstract’’ groups, sets with a 
particular type of binary operation. Throughout this chapter the APL in-
dex origin is assumed to be 0 unless there is an explicit statement to the
contrary.


1. BINARY OPERATIONS 


In Section 1.3 we defined a binary operation on a set X to be a function
from X × X to X. We also adopted the convention of using the symbol •
to denote a typical binary operation and of writing x • y for the image of
the pair (x, y) under •. We usually refer to x • y as the product of x and y,
even though • may not have anything to do with ordinary multiplication of
numbers. We say that • is commutative on X if x • y = y • x for all x and y
in X and that • is associative if x • (y • z) = (x • y) • z for all x, y, and z
in X.

A binary operation on ⍳N is an N-by-N matrix B whose entries lie in
⍳N. Such an operation B is commutative if and only if B is equal to ⍉B and
B is associative if and only if B[I;B[J;K]] and B[B[I;J];K] are the
same for all I, J, and K, that is, if and only if the arrays B[;B] and B[B;]
are equal. Thus the APL propositions corresponding to the assertions that
B is commutative and that B is associative are ∧/,B=⍉B and


      ∧/,B[;B]=B[B;],

respectively. In EXAMPLES the matrix G6 is a binary operation on ⍳6.


      G6                         ⎕IO←0
0 1 2 3 4 5                      ∧/,G6∊⍳6
1 0 4 5 2 3                1
2 3 0 1 5 4
3 2 5 4 0 1
4 5 1 0 3 2
5 4 3 2 1 0

Substituting G6 for B in these two propositions,

      ∧/,G6=⍉G6                  ∧/,G6[;G6]=G6[G6;]
0                          1


we find that G6 is not commutative but is associative. 
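
The same two tests can be written out index by index. The Python sketch below represents a binary operation on {0, ..., N-1} as a list of rows; the function names are ours, and the table G6 is copied from the display above.

```python
G6 = [
    [0, 1, 2, 3, 4, 5],
    [1, 0, 4, 5, 2, 3],
    [2, 3, 0, 1, 5, 4],
    [3, 2, 5, 4, 0, 1],
    [4, 5, 1, 0, 3, 2],
    [5, 4, 3, 2, 1, 0],
]

def is_commutative(b):
    # B equals its transpose.
    n = len(b)
    return all(b[i][j] == b[j][i] for i in range(n) for j in range(n))

def is_associative(b):
    # B[I;B[J;K]] equals B[B[I;J];K] for all I, J, K.
    n = len(b)
    return all(b[i][b[j][k]] == b[b[i][j]][k]
               for i in range(n) for j in range(n) for k in range(n))

print(is_commutative(G6), is_associative(G6))
```

The associativity test examines N^3 triples, which is harmless for N = 6 but grows quickly with the size of the table.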


If X is a small finite set, it is often convenient to describe a binary 
operation on X by a table. For example, if X = {x9,x,,X,}, then the table 


defines the binary operation @® on X in which x9 © X9 = X,,X, @X; =X, 
and so on. We call the matrix 


U<7<3 391 0121032 2 2 


NOME a 
DFO 
Noma. 


the binary operation table for • with respect to this particular numbering
of the elements of X. In general, if • is a binary operation on the N-element
set X and we have numbered the elements of X from 0 to N-1 (or 1 to N in
origin 1), then the binary operation table for • with respect to the given
numbering of X is the N-by-N matrix T such that T[I;J] is the number of
the product of the Ith and Jth elements of X. Clearly, T is itself a binary
operation on ⍳N.

Let n be a positive integer. In Section 2.3 we defined two binary
operations + and × on Z_n, the set of congruence classes modulo n. It is
easy to compute binary operation tables for these operations. The most
natural numbering of the elements of Z_n is to let the ith congruence class
be [i], the class containing i, 0 ≤ i < n. Using this numbering, the binary
operation table for + on Z_6 is




      ⎕←Z6←6|(⍳6)∘.+⍳6
0 1 2 3 4 5
1 2 3 4 5 0
2 3 4 5 0 1
3 4 5 0 1 2
4 5 0 1 2 3
5 0 1 2 3 4


THEOREM 1. The binary operations + and × on Z_n are commutative
and associative.


Proof. Let a and b be in Z. Then, in Z_n, we have
[a] + [b] = [a + b] = [b + a] = [b] + [a],
[a] × [b] = [a × b] = [b × a] = [b] × [a].


Thus the commutativity of + and × on Z_n follows from the commuta-
tivity of the corresponding operations on Z. Associativity is proved in the
same way and is left to the reader. □


Let X be any set. Composition of functions is an associative binary
operation on X^X. The set Σ(X) of all permutations of X is a subset of
X^X and, by Corollary 1.3.3, the composition of two elements of Σ(X) is
again in Σ(X). Thus we may restrict the operation of composition to Σ(X)
and get an associative binary operation on Σ(X).

If • is a binary operation on a set X, we can define a new binary oper-
ation ⊙ on 2^X by


A ⊙ B = {a • b | a ∈ A, b ∈ B}.


If B = {b}, we often write A ⊙ b instead of A ⊙ {b}. As noted earlier, the
product x • y is sometimes abbreviated xy when the operation • is clear from
context. In this case A ⊙ B is written AB. Even when the symbol • is not
omitted, it is standard practice to write • for ⊙, although these are really
different binary operations. Thus, if A and B are subsets of Z, then A + B
and A + 1 denote {a + b | a ∈ A, b ∈ B} and {a + 1 | a ∈ A}, respectively.


THEOREM 2. If • is an associative binary operation on X, then ⊙ is
associative on 2^X.


Proof. The proof is straightforward and left as an exercise. □

A semigroup is defined to be a pair (X, •) of a set X and an associative
binary operation • on X. As noted in Section 1.3, the function • determines
its domain X × X, which in turn determines X. Thus X is determined by
•, and so we could just as well define the semigroup to be the binary oper-
ation •, X being redundant. This is, in fact, almost never done. Actually,
it is standard practice to refer to the semigroup (X, •) by the symbol X,
even though a given set may have many associative binary operations on
it. Of course, this should only be done when the binary operation is clear 
from context. 

If • is an arbitrary binary operation on X, then an expression such as
x • y • z • w is ambiguous, since we need to specify in which order the
operations are to be carried out. (Traditional mathematical notation has
no Right-to-Left Rule!) Parentheses must be added to indicate whether we
mean x • (y • (z • w)), (x • (y • z)) • w, or one of the other three possi-
bilities. However, in a semigroup parentheses are unnecessary. If • is asso-
ciative, then


x • (y • (z • w)) = x • ((y • z) • w) = (x • (y • z)) • w =
((x • y) • z) • w = (x • y) • (z • w).


In fact, an induction argument shows that any two ways of adding paren-
theses to a_1 • a_2 • ... • a_n yield expressions with the same value. In par-
ticular, we can define x^n to be the product x • x • ... • x, with n factors
whenever n is a positive integer. It is easy to check that the laws of exponents,
x^m • x^n = x^(m+n),
(x^m)^n = x^(mn),
hold for any positive integers m and n.

An element e of X is a left identity element for the binary operation
• on X if e • x = x for all x in X. Similarly, f is a right identity element if
x • f = x for all x in X. An element that is both a left and a right identity
element is called a two-sided identity element. Clearly, if the operation •
is commutative, then the notions of left, right, and two-sided identity ele-
ments coincide.

If B is a binary operation on ⍳N, then an element E of ⍳N is a left
identity if and only if ∧/B[E;]=⍳N and a right identity if and only if
∧/B[;E]=⍳N. For G6,


      G6                         ∧/G6[0;]=⍳6
0 1 2 3 4 5                1
1 0 4 5 2 3                      ∧/G6[;0]=⍳6
2 3 0 1 5 4                1
3 2 5 4 0 1
4 5 1 0 3 2
5 4 3 2 1 0


0 is a two-sided identity. 

The integers 0 and 1 are two-sided identities for + and × on Z, respec-
tively. The operation - on Z has a right identity, 0, but no left identity. The
identity function on a set X is a two-sided identity for composition of func-
tions on X. The congruence classes [0] and [1] modulo n are two-sided iden-
tities for + and × on Z_n, respectively.

A binary operation may have many left identities or many right iden- 
tities but not both. 


THEOREM 3. Let • be a binary operation on X and let e be a left
identity and f a right identity for •. Then e = f, and so e is a two-sided iden-
tity. Moreover, e is the only identity element for •, right, left, or two-sided.


Proof. Consider the element e • f. Since e is a left identity, e • f = f.
However, since f is a right identity, e • f = e. Thus e = f and e is a two-sided
identity. The same argument shows that any right identity is equal to e and,
since e is also a right identity, that any left identity is also equal to e. □


A monoid is a semigroup (X, •) that has a two-sided identity element
e. By Theorem 3, e is unique. The semigroups (Z, +), (Z, ×), (Z_n, +), (Z_n, ×),
(X^X, ∘), and (Σ(X), ∘) are all monoids, where ∘ denotes composition of
functions on the set X.

Let (X, •) be a monoid with identity element e and let x be in X. An
element y of X is called a left inverse for x if y • x = e. Similarly, a right
inverse for x is an element z such that x • z = e. A two-sided inverse for x
is an element of X that is both a left and a right inverse for x. If • is com-
mutative, then left, right, and two-sided inverses are the same. Every in-
teger x has an inverse in the monoid (Z, +), but only 1 and -1 have inverses
in (Z, ×). However, in (Q, ×) every nonzero rational number has an inverse.
The inverse of the congruence class [i] in (Z_n, +) is [-i]. If f is a permutation
of the set X, then f^-1 is a two-sided inverse for f in (X^X, ∘).

Suppose the matrix B defines a monoid on ⍳N with identity element
E. Then I in ⍳N has a right inverse if and only if some component of B[I;]
is equal to E, that is, if and only if ∨/B[I;]=E. In fact, B[I;]=E is the
characteristic vector for the set of right inverses of I. In the monoid (⍳6, G6)
every element has a two-sided inverse. The inverses are listed in the vector
INV6 in EXAMPLES.


      INV6
0 1 2 4 3 5


The inverse of I is INV6[I]. (See Exercise 16.)
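
The row and column tests for identities and inverses translate directly into Python. In this sketch the helper names are ours, and G6 and INV6 are copied from the displays in the text.

```python
G6 = [
    [0, 1, 2, 3, 4, 5],
    [1, 0, 4, 5, 2, 3],
    [2, 3, 0, 1, 5, 4],
    [3, 2, 5, 4, 0, 1],
    [4, 5, 1, 0, 3, 2],
    [5, 4, 3, 2, 1, 0],
]
INV6 = [0, 1, 2, 4, 3, 5]

def left_identities(b):
    # E is a left identity when row E of the table is 0, 1, ..., N-1.
    n = len(b)
    return [e for e in range(n) if b[e] == list(range(n))]

def right_identities(b):
    # E is a right identity when column E of the table is 0, 1, ..., N-1.
    n = len(b)
    return [e for e in range(n) if [b[i][e] for i in range(n)] == list(range(n))]

def right_inverses(b, e, i):
    # The positions J at which row I of the table equals E.
    return [j for j in range(len(b)) if b[i][j] == e]

print(left_identities(G6), right_identities(G6))
print([right_inverses(G6, 0, i) for i in range(6)])
```

Each element of this monoid has exactly one right inverse, and the list of them agrees with INV6.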

Theorem 3 tells us that a semigroup cannot have many right identities 
and also many left identities. Similarly, we can show that an element of a 
monoid cannot have many right inverses and also many left inverses. 


THEOREM 4. Let (X, •) be a monoid and let x be in X. Suppose y is
a left inverse of x and z is a right inverse of x. Then y = z. If u and v are in
X and each has a two-sided inverse, then so does u • v.


Proof. Let e be the identity element of (X, •) and suppose y • x =
x • z = e. By the associativity of •, we have


y = y • e = y • (x • z) = (y • x) • z = e • z = z.

If s and t are two-sided inverses for u and v, respectively, then

(u • v) • (t • s) = (u • (v • t)) • s = (u • e) • s = u • s = e.


Similarly, (t • s) • (u • v) is e, and so t • s is a two-sided inverse for u • v. □

The first part of Theorem 4 implies that in a monoid, if x has a two- 
sided inverse y, then y is the only inverse of x, left, right, or two-sided. 

In proving the second part of Theorem 4 we found that the inverse of
u • v was the product of the inverse of v and the inverse of u. This fact is
often stated as "the inverse of a product is the product of the inverses in the
opposite order." It should also be pointed out that if y is a right inverse for
x, then x is a left inverse for y. Thus if y is the two-sided inverse of x, then 
x is the two-sided inverse of y. 

Let us now determine which elements in the monoid (Z_n, ×) have
inverses. 


THEOREM 5. Let n be a positive integer. The congruence class [u]
has an inverse in (Z_n, ×) if and only if gcd(u, n) = 1.


Proof. The class [x] is an inverse for [u] if and only if [u] × [x] =
[1] or, equivalently, ux ≡ 1 (mod n). By Theorem 2.3.3, this congruence
has a solution x if and only if gcd(u, n) divides 1, that is, gcd(u, n) = 1. □

For example, suppose we wish to find the inverse of [8] in (Z_37, ×)
or, in other words, we want to solve the congruence 8x ≡ 1 (mod 37). Using
the method of Section 2.3, we express 1 = gcd(8, 37) as 8x + 37y.


      8 ZGCD 37
1 14 ¯3

We find that x = 14 is a solution of the congruence, and so [14] is the in-
verse of [8]. As a check, we can compute


      37|8×14
1
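
The computation performed by ZGCD can be imitated with the extended Euclidean algorithm. The following Python sketch is our own; the names ext_gcd and inverse_mod are hypothetical, and we do not claim that ZGCD is implemented this way.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) = a*x + b*y."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1   # coefficients of the original a
        y0, y1 = y1, y0 - q * y1   # coefficients of the original b
    return a, x0, y0

def inverse_mod(u, n):
    """Inverse of [u] in (Z_n, x), or None when gcd(u, n) is not 1."""
    g, x, _ = ext_gcd(u, n)
    return x % n if g == 1 else None

print(ext_gcd(8, 37), inverse_mod(8, 37))
```

By Theorem 5 the value None is returned exactly when [u] has no inverse, for example for u = 6 and n = 9.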


Let U_n be the set of elements in Z_n with multiplicative inverses. By
Theorem 5,


U_n = {[x] | gcd(x, n) = 1}.


The product of any two elements of U_n is again in U_n. This can be seen
either by applying the second part of Theorem 4 or by observing that the
product of two integers that are relatively prime to n is itself relatively prime
to n. Also, if [x] is in U_n and [y] is the multiplicative inverse of [x], then
[y] is in U_n too. Since multiplication in Z_n is associative and U_n clearly
contains [1], it follows that (U_n, ×) is a monoid in which every element has
a two-sided inverse.

As the final topic of this section, we will discuss the computation of
powers in a semigroup (X, •). Given an element x of X and a positive in-
teger n, we have defined x^n to be the product x • ... • x, with n factors.
We can also define x^n recursively by setting x^1 = x and x^(n+1) = x • x^n for
n ≥ 1. If n is large and the amount of work necessary to compute the product
of two elements of X is also large, it is quite inefficient to compute x^n using
this definition. For example, to compute x^57 requires calculating 56 products.
However, if we compute x^2 = x • x, x^4 = x^2 • x^2, x^8 = x^4 • x^4, x^16 = x^8 • x^8,
and x^32 = x^16 • x^16, then we can get x^57 as x • x^8 • x^16 • x^32, having
calculated only 8 products. In general, let a_r ... a_0 be the binary (base 2)
representation for n. We may compute x, x^2, x^4, ..., x^(2^r) with r products. Then
x^n is the product of the terms x^(2^i) for which a_i = 1. An alternative recursive
formulation of this same approach is to say that


x^(2m) = (x^2)^m,
x^(2m+1) = x • (x^2)^m,


for all m > 0. This method of calculating powers will be referred to as the
binary power algorithm. Exercise 23 describes an APL procedure that com-
putes powers in (R, ×) using the binary power algorithm. The binary power
algorithm is particularly useful for calculating powers in the monoid (Z_n, ×).

Even the binary power algorithm does not always compute powers
using the smallest number of multiplications. The first exponent for which
an improvement is possible is 15. The binary power algorithm calculates
x^15 as x • x^2 • x^4 • x^8, which requires 6 products. However, forming
y = x^3 = x • x • x and then x^15 = y^5 = y • (y^2)^2 needs only 5 products.
For a more thorough discussion of the computation of powers in a monoid,
consult Knuth [Vol. 2].
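
The product counts quoted above can be checked with a short Python sketch of the binary power algorithm. The function power and its argument mul are our own names; mul may be any associative product, so the same code serves for (R, ×) and for (Z_n, ×).

```python
def power(x, n, mul):
    """Return (x^n, number of products used) for n >= 1 in a semigroup."""
    result = None          # no identity element is assumed
    count = 0
    while n > 0:
        if n % 2 == 1:     # current binary digit a_i is 1
            if result is None:
                result = x
            else:
                result = mul(result, x)
                count += 1
        n //= 2
        if n > 0:          # square only if a higher digit remains
            x = mul(x, x)
            count += 1
    return result, count

print(power(2, 15, lambda a, b: a * b))
print(power(3, 57, lambda a, b: (a * b) % 1000))
```

For n = 57 the sketch uses 5 squarings and 3 further products, and for n = 15 it uses 6 products, in agreement with the counts in the text.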


EXERCISES 


1 Show that ∧ and ∨ are commutative and associative binary oper-
ations on {0, 1}.

2 Complete the proof of Theorem 1. 

3 Prove Theorem 2. 

4 Prove that in any semigroup the identities x^m • x^n = x^(m+n) and
(x^m)^n = x^(mn) hold for all positive integers m and n.


5 There is exactly one binary operation • on the empty set. Is (∅, •)
a semigroup? Is it a monoid?




6 Let X be a set and let W(X) denote the set of all vectors, including
the empty vector, with components in X. We will use APL notation
for all elements of W(X). Suppose that U and V are in W(X); then
U,V is in W(X). Show that (W(X), ,) is a monoid. [The elements
of W(X) are often called words in the elements of X.]

7 List the elements of U_n for 2 <n < 6.

8 Prove that in a semigroup (X, •) it is not necessary to put paren-
theses in expressions such as a_1 • a_2 • ... • a_n.

9 Let X be a set and let f be an element of the monoid X^X. Show
that f is injective if and only if f has a right inverse in X^X and that
f is surjective if and only if f has a left inverse.

10 Find the inverse of [27] in (Z,3, × ).

11 Show that the number of multiplications needed to compute x^n
using the binary power algorithm does not exceed 2 log_2 n.

12 The workspace EXAMPLES contains matrices B1, B2, and B3,
which are binary operations on ⍳6. Determine which of these
operations are commutative and which are associative.

13 Let N be an integer greater than 1. Show that each of the fol-
lowing matrices is a binary operation on ⍳N (in origin 0). Which
of these operations are associative?

(a) (1WV)o.f iW 

(b) |(⍳N)∘.-⍳N

(c) N|(⍳N)∘.+⍳N

(d) N|(⍳N)∘.×⍳N

(e) L¢wW)o.tit+i1W 

14 Construct the binary operation table for (Z,, × ) using the num-
bering of the elements of Z, given in the text.

15 Let W←5 4 3 2 1 0. We can number the elements of Z_6 so that
the Ith element is the congruence class containing W[I]. Construct
the binary operation tables for (Z_6, +) and (Z_6, ×) using this num-
bering.

16 Suppose the matrix B defines a binary operation on ⍳N with two-
sided identity E and suppose V is a vector of length N with com-
ponents in ⍳N. Write an APL proposition corresponding to the
assertion that V[I] is a right inverse for I. Do the same for left
inverses and two-sided inverses. Use this approach to show that
the vector INV6 in EXAMPLES does, in fact, list the two-sided
inverses of the elements in (⍳6, G6).


17 Write an APL proposition corresponding to the assertion that the
inverse of the product of two elements of (⍳6, G6) is the product
of the inverses in the opposite order.


18 Let (⍳N, B) be a monoid with identity element E. Write APL ex-
pressions for the characteristic vectors of the following subsets
of ⍳N.

(a) The set of elements with right inverses.

(b) The set of elements with left inverses.

(c) The set of elements with two-sided inverses.

19 The columns of the matrix F←3 3 3⊤⍳27 list all of the maps
of X = {0, 1, 2} into itself. Construct the binary operation table
T for X^X using this numbering of the elements of X^X.

20 Let F be as in Exercise 19 and let P be the vector of length 6 such
that the columns of F[;P] are the six permutations of X = {0, 1, 2}.
Using P and the table T constructed in Exercise 19, write an ex-
pression for a binary operation table for the monoid (Σ(X), ∘).

21 Let N be an integer greater than 1 and let P be a vector listing the
set of primes dividing N. Write an APL expression defining a vector
that lists all positive integers less than N and relatively prime to N.

22 Construct a binary operation table for (Ujo9, × ).

23 Enter the following procedure definition at a terminal.


    ∇ Y←X POWER N
[1]   Y←1
[2]  LOOP:→(N=0)/0
[3]   →(0=2|N)/EVEN
[4]   Y←X×Y
[5]  EVEN:X←X×X
[6]   N←⌊N÷2
[7]   →LOOP
    ∇


Compute 2 POWER 23 and compare it with 2*23. Show that
POWER is an implementation of the binary power algorithm for
(R, ×).

24 The procedure MPZPOWER described in Section 2.5 uses the
binary power algorithm to compute powers of integers using mul-
tiple precision. Calculate 2! exactly with MPZPOWER.

25 The procedure ZNPOWER uses the binary power algorithm to com-
pute powers in (Z_n, ×) where n is given by the global variable
N. If X and M are integers with M nonnegative, then X ZNPOWER M
is congruent to X*M modulo N. Use ZNPOWER to show that


2^990000 ≡ 1 (mod 1000001).




26 Write a procedure using the binary power algorithm to calculate
powers of maps of ⍳N into itself. That is, given a vector F with
components lying in ⍳⍴F and a nonnegative integer N, the pro-
cedure should calculate the vector G that would be obtained by
composing F with itself N-1 times. (If N = 0, then G should be
⍳⍴F.)

27 Write a recursive procedure for calculating powers in (R, ×) based
on the observation that x^(2m) = (x^m)^2 and x^(2m+1) = x(x^m)^2 for
m > 0.

28 Let f: N → N map i to i + 1, i = 1, 2, .... Show that f has infinitely
many right inverses and no left inverses in the monoid N^N.


2. GROUPS 


In our investigation of sets with binary operations in Section 1 we came 
across several monoids in which every element has a two-sided inverse. The 
monoids (Z, +), (Z_n, +), (Σ(X), ∘), (U_n, ×), and (⍳6, G6) all have this
property. Such monoids are called groups. The rest of this chapter will be 
devoted to the study of groups. 

Since groups play such an important role, not only in algebra but in 
all of mathematics, it is worthwhile restating the definition. A group is a 
pair (G, •) consisting of a set G and a binary operation • on G such that:


1. • is associative.


2. There exists an element e in G such that e • x = x • e = x for all
x in G.


3. For each x in G there is an element y of G such that x • y = y • x = e.


By Theorem 1.3 the identity element e is unique and by Theorem 1.4
the inverse y of an element x of G is also unique. As we have remarked
before, it is conventional to refer to G as the group as long as the oper-
ation • is clear.

An N-by-N matrix G is called a group table if (⍳N, G) is a group. Sup-
pose that (G, •) is a group and that G is a finite set. It is easy to show that
the binary operation table G for (G, •) relative to any numbering of the
elements of G is a group table. We say that G is a group table for (G, •).
In CLASSLIB there are several procedures for working with groups defined 
by group tables. The names of these procedures all begin with the prefix 
GT. Most of these group table procedures perform computations in the 
group described by the three global variables GTABLE, GTIO, and GTINV. 
The matrix GTABLE is the group table defining the binary operation. Since
GTABLE cannot be a group table both in origin 0 and in origin 1, we need
to know which origin to use with GTABLE. The variable GTIO gives the
correct origin. In addition, all of the procedures assume that GTIO is the
identity element for GTABLE. The Ith component of the vector GTINV
is the inverse of I in the group defined by GTABLE. We will refer to the
group described by GTABLE, GTIO, and GTINV as the current abstract 
group. 

In order to make (⍳6, G6) the current abstract group, we use the
procedure GTINIT to initialize GTABLE, GTIO, and GTINV.


      GTINIT G6                  GTIO
      ∧/,GTABLE=G6         0
1                                GTINV
                           0 1 2 4 3 5


The single argument to GTINIT is the group table, which is copied into 
GTABLE. The variables GTIO and GTINV are then computed. In order 
to save time, GTINIT does not check whether GTABLE really does de- 
fine a group. In order to test whether a given matrix is a group table for 
the appropriate choice of index origin, it is necessary to use GTCHECK. 


GTCHECK G6 
1 


The value returned by GTCHECK is 1 or 0, depending on whether the
argument is or is not a group table, with the identity element being the 
smallest entry in the table. Executing GTCHECK does not affect the defini- 
tion of the current abstract group; that is, GTABLE, GTIO, and GTINV 
are not changed. 

If A and B are arrays whose entries are elements of the current ab- 
stract group, then A GTPROD B is the entry-by-entry product of A and 
B in the group. 


      2 GTPROD 4                 0 GTPROD ⍳6
5                          0 1 2 3 4 5
      GTABLE[2;4]                (⍳6) GTPROD 0
5                          0 1 2 3 4 5
      3 GTPROD 1                 GTINV GTPROD ⍳6
2                          0 0 0 0 0 0
      2 3 GTPROD 4 1             (⍳6) GTPROD GTINV
5 2                        0 0 0 0 0 0


Note that arguments for GTPROD that have only one entry are expanded to
match the other argument.




The workspace EXAMPLES contains two other origin 0 group tables,
G24 and G60, which are 24-by-24 and 60-by-60 matrices, respectively.


      ⍴G24                       ⍴G60
24 24                      60 60

      GTCHECK G24                GTCHECK G60
1                          1


The matrix G60 is too big to permit the associative law to be verified di-
rectly by comparing G60[;G60] and G60[G60;]. The approach used in
GTCHECK is much more efficient in both time and space. (See Exercise
8.21.) The vectors INV24 and INV60 list the inverses of the elements in
(⍳24, G24) and (⍳60, G60), respectively.


      ⍴INV24                     GTINIT G24
24                               13 GTPROD INV24[13]
      ⍴INV60               0
60


Although group tables are an important aid as one is beginning the 
study of groups, in practice one does not perform computations in a par- 
ticular group G by first constructing a group table for G and then working 
with that table. The group G may have infinitely many elements and, even 
if G has only finitely many elements, a group table for G may be too large 
to fit into any computer. For example, to show that [471562] + [823495] =
[307403] in the group (Z_987654, +) hardly requires the construction of a
group table. 

Let (G, •) be a group. If the operation • is commutative, we say the
group is commutative or abelian. [The term "abelian" is derived from the
name of Norwegian mathematician Niels Abel (1802-1829).] A nonabelian
group is a group that is not abelian. If G is a finite set, (G, •) is said to be a
finite group and |G| is called the order of the group. Thus (Z_n, +) is abelian,
finite, and of order n. The group (U_n, ×) is also finite and abelian. Its order
is usually denoted by φ(n), and φ is referred to as the Euler phi-function.
[Leonhard Euler (1707-1783) was a very important Swiss mathematician.]
The function φ is important in number theory. We will exhibit a nice for-
mula for φ(n) in Section 6.

The group Σ(X) of all permutations of the set X is called the sym-
metric group on X. Recall that if X is the set ⍳N, we write Σ_N instead of
Σ(X).

THEOREM 1. The group 2, has order n! and is nonabelian if n > 3. 


Proof. The order of Σ_n was determined in Exercise 1.3.9. Suppose n ≥
3. Let f be the element of Σ_n that interchanges 0 and 1 and fixes 2, ...,
n − 1, and let g be the element interchanging 1 and 2 and fixing the
remaining points. Then f ∘ g maps 0 to 2 while g ∘ f maps 0 to 1. Thus
f ∘ g ≠ g ∘ f, and so Σ_n is nonabelian. □
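The computation in the proof can be replayed for n = 3. This is a small Python sketch, not the book's APL; it assumes permutations of ιn are stored as tuples p with p[x] the image of x, and that f ∘ g means "apply f, then g", which is the convention the proof uses.

```python
from itertools import permutations
from math import factorial


def compose(f, g):
    """Return f o g with maps written on the right: x(f o g) = (xf)g."""
    return tuple(g[f[x]] for x in range(len(f)))


f = (1, 0, 2)  # interchanges 0 and 1, fixes 2
g = (0, 2, 1)  # interchanges 1 and 2, fixes 0

fg = compose(f, g)
gf = compose(g, f)
print(fg, gf)

# The order of the symmetric group on 4 points is 4!.
sigma_4 = list(permutations(range(4)))
print(len(sigma_4) == factorial(4))
```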


THEOREM 2. Suppose (G, •) is a group and u, x, y are in G. Then

(a) If u • x = u • y, then x = y.

(b) If x • u = y • u, then x = y.

Proof. Let e be the identity element of G and let v be the inverse of
u. If u • x = u • y, then

x = e • x = (v • u) • x = v • (u • x) = v • (u • y) = (v • u) • y = e • y = y.

Part (b) is proved in the same way. □


Parts (a) and (b) of Theorem 2 are called the left and right cancellation
laws, respectively.


COROLLARY 3. Let (G, •) be a group, let u be a fixed element of G,
and let v be the inverse of u. The maps R_u: G → G and L_u: G → G defined
by R_u: x ↦ x • u and L_u: x ↦ v • x are permutations of G.


Proof. By Theorem 2, the maps R_u and L_u are injective. Let y be an
element of G. Then (y • v)R_u = (y • v) • u = y • (v • u) = y. Therefore
R_u is surjective also. Similarly, L_u maps u • y to y, and so both R_u and
L_u are permutations of G. □


The reason for defining xL_u to be v • x instead of u • x is explained
in Exercise 3. It follows from Corollary 3 that if G is an N-by-N group
table, then each row and each column of G is a permutation of ιN.
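The remark about rows and columns is easy to test on a small table. The sketch below is in Python rather than APL; the matrix G6 is the 6-by-6 group table displayed in Section 3, and the function name is an assumption made for illustration.

```python
# The group table G6 as it is displayed in Section 3 (0 is the identity).
G6 = [
    [0, 1, 2, 3, 4, 5],
    [1, 0, 4, 5, 2, 3],
    [2, 3, 0, 1, 5, 4],
    [3, 2, 5, 4, 0, 1],
    [4, 5, 1, 0, 3, 2],
    [5, 4, 3, 2, 1, 0],
]


def rows_and_columns_are_permutations(table):
    """True if every row and every column lists each of 0..n-1 exactly once."""
    n = len(table)
    full = set(range(n))
    rows_ok = all(set(row) == full for row in table)
    cols_ok = all({table[i][j] for i in range(n)} == full for j in range(n))
    return rows_ok and cols_ok


print(rows_and_columns_are_permutations(G6))
```

The condition is necessary but not sufficient for a group table, since it says nothing about associativity.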

There are two standard notational conventions used in connection
with groups. In multiplicative notation the symbol for the group operation
is omitted so that x • y is written as xy and called the product of x and
y. The identity element is denoted by e or, more commonly, by 1. The
inverse of x is written x^{-1}. By Theorem 1.4, we have the formula (xy)^{-1} =
y^{-1}x^{-1}. If n is a positive integer, then x^n is the product xx...x, with
n factors, and x^{-n} is defined to be (x^{-1})^n. Finally, x^0 is defined to be 1.

THEOREM 4. Let G be a group written multiplicatively, let m and n
be integers, and let x and y be elements of G. Then

x^m x^n = x^{m+n},
(x^m)^n = x^{mn}.

If x and y commute, then (xy)^n = x^n y^n.


Proof. The proof is left to the reader. The case in which m and n
are both positive has already been discussed in Section 1. □


As a consequence of Theorem 4, we have the fact that the inverse of
x^n is x^{-n}.


Groups may also be written additively. In additive notation the group 
operation is denoted by a plus sign, even though the operation may have 
little to do with ordinary addition. The value of x + y is called the sum 
of x and y, and zero is used to denote the identity element. The inverse 
of x is written −x and the sum x + (−y) is abbreviated x − y. If n is a
positive integer and x is a group element, then the sum x + x + ... + x, with
n summands, is written nx and called the nth multiple of x. The multiple
(−n)x is defined to be n(−x), and 0x is defined to be the identity element 0.

Additive notation is used almost exclusively for abelian groups such 
as (Z, +) and (Z,, +). Multiplicative notation is used for both abelian and 
nonabelian groups. We will use multiplicative notation for the symmetric 
groups X(X). In making definitions and stating theorems about groups, 
we will normally write the groups multiplicatively. 

There is an entertaining pastime indulged in by many students of group
theory; it consists of trying to find the weakest assumptions that can be
made about a binary operation • on a set X that imply that (X, •) is a group.
The next result illustrates one attempt of this kind.


THEOREM 5. Let (G, •) be a semigroup with a left identity e such
that relative to e every element of G has at least one left inverse. Then
(G, •) is a group.

Proof. Since e is a left identity, e • e = e. Let x be any element of
G such that x • x = x. If y is a left inverse of x relative to e, then

e = y • x = y • (x • x) = (y • x) • x = e • x = x.

Thus e is the only element of G that is equal to its square. Now let u be
any element of G and let v be a left inverse for u. Then, since v • u = e,
we have

(u • v) • (u • v) = u • ((v • u) • v) = u • (e • v) = u • v.

Therefore u • v must be e, and so v is a right inverse for u too. Also,

u • e = u • (v • u) = (u • v) • u = e • u = u,

and so e is a right identity. Thus we have a two-sided identity relative to
which two-sided inverses exist, and so (G, •) is a group. □


It seems that it is very difficult to weaken the associative law in any
significant way and still have a group. See Exercise 12 in this regard.


EXERCISES 


1   Show that each of the following is a group.
    (a) ({1, −1}, ×).
    (b) (22Z, +).
    (c) (Q − {0}, ×).
    (d) (Z × Z, •), where (x_1, x_2) • (y_1, y_2) = (x_1 + y_1, x_2 + y_2).

2   Complete the proof of Theorem 2.

3   Let G be a group and let R_u and L_u be the permutations of G
    defined in Corollary 3. Show that for u and w in G we have R_u ∘
    R_w = R_{uw} and L_u ∘ L_w = L_{uw}. Would the second condition hold
    if we had defined xL_u to be u • x?

4   Prove Theorem 4.

5   Write Theorem 4 using additive notation.

6   Suppose (G, •) is a group. Define a new binary operation * on G
    by x * y = y • x. Show that (G, *) is a group.

7   Suppose (G, •) is a group and f: G → X is a bijection. Define a
    binary operation * on X by

    x * y = ((xf^{-1}) • (yf^{-1}))f.

    Prove that (X, *) is a group.

8   Let G be a group in which x^2 = 1 for all x in G. Show that G is
    abelian.

9   Let p be a prime and r a positive integer. Prove that
    φ(p^r) = p^{r−1}(p − 1).

10  Show that an integer n greater than 1 is a prime if and only if
    φ(n) = n − 1.

11  Let (X, •) be a semigroup. Assume that X is not empty and that
    for all u and v in X there exist x and y in X such that u • x = v
    and y • u = v. Show that (X, •) is a group.

*12 Let G be a nonempty set, let D be a subset of G × G, and let
    • be a function from D to G. Suppose:
    (a) For all u and v in G there exist x and y in G such that (u, x)
        and (y, u) are in D and u • x = v = y • u.
    (b) If x, y, and z are in G, then x • (y • z) = (x • y) • z provided
        all four products are defined, that is, if (x, y), (y, z), (x, y • z),
        and (x • y, z) are all in D.
    Prove that D = G × G and (G, •) is a group.

13  Verify that ⍉G6 is a group table. (Compare Exercise 6.)

14  Let P←2⌽ι6 and G←P[G6[⍋P;⍋P]]. Verify that G is a group
    table. (Compare Exercise 7.)

15  Compute φ(n) for n = 2, ..., 10.

16  Prove that if G is a binary operation table for the finite group G,
    then G is a group table.

17  Given that the matrix G is a group table, write a simple APL
    expression for the identity element of G.

18  How many of the elements in the group (ι24, G24) satisfy the
    condition x^{-1} = x? How many elements in (ι60, G60) satisfy
    this condition?

19  Determine the number of n-by-n group tables for 1 ≤ n ≤ 3.


3. SUBGROUPS 


In Section 2.2 we defined an additive subgroup of Z to be a nonempty
subset of Z that contains the sum and difference of any two of its elements.
It is now time to place this definition in its proper context. Let G be a
group. A subgroup of G is a nonempty subset H of G such that whenever
x and y are in H, then xy and x^{-1} are also in H. We often say that a
subgroup H is closed under products and inverses. (If we are using additive
notation, we say H is closed under sums and negatives.) A subset M of Z is
clearly a subgroup of (Z, +) if and only if M is an additive subgroup of Z
in the sense of Section 2.2. Thus the subgroups of Z are the sets of the
form nZ, n ≥ 0. The set {[0], [2], [4]} of congruence classes modulo 6 is
a subgroup of (Z_6, +).

Let x be an element of the set X and let H be the set of all permutations
of X that fix x. If f and g are in H, then so are f ∘ g and f^{-1} and,
therefore, H is a subgroup of Σ(X). We call H the stabilizer of x in Σ(X).
For any group G the sets {1} and G are always subgroups of G. A proper
subgroup of G is a subgroup different from G, and a nontrivial subgroup
is a subgroup different from {1}.


THEOREM 1. Let H be a subgroup of the group G. Then H is a group
in its own right under the restriction to H of the binary operation on G.
A subset K of H is a subgroup of G if and only if K is a subgroup of H.


Proof. Since H is closed under products, the restriction of the binary
operation on G to H is a binary operation on H. Since the associative law
holds in all of G, it holds in H. Now H is nonempty and so contains an
element h. Therefore H contains 1 = hh^{-1}, and so H has a two-sided
identity. Since H is closed under inverses, H is a group. The remainder of the
proof is left as an exercise. □


Suppose H and K are subgroups of a group G. In general, the set H ∪ K
will not be a subgroup of G. For example, if we take G = Z, H = 2Z, and K =
3Z, then 2 and 3 are in H ∪ K but 5 = 2 + 3 is not. For intersections,
however, things are much nicer.


THEOREM 2. Let G be a group. For each i in the nonempty set I
let H_i be a subgroup of G. Then

K = ∩_{i∈I} H_i

is a subgroup of G.


Proof. Each of the subgroups H_i contains the identity element 1 of G.
Therefore 1 is in K and K is nonempty. Let x and y be elements of K. Then,
for each i in I, the subgroup H_i contains x and y and therefore also xy and
x^{-1}. Thus xy and x^{-1} are in K, and so K is a subgroup of G. □


Let X be a subset of the group G. The set of subgroups H of G that
contain X is not empty, since H = G is one such subgroup. By Theorem 2,
the intersection K of all subgroups of G containing X is a subgroup of G.
Clearly, K contains X and is the smallest subgroup of G that contains X.
We denote K by <X> and refer to K as the subgroup of G generated by X.
An interesting example is given by the case X = ∅. If K = <∅>, then K is
a subgroup, and so 1 is in K. But {1} is a subgroup of G which contains ∅,
and thus K ⊆ {1}. Therefore K = {1}, and so ∅ generates the trivial
subgroup of G. The next theorem gives a more concrete description of <X>
when X ≠ ∅.


THEOREM 3. Let X be a nonempty subset of the group G. The subgroup
<X> is the set of all elements of G that can be expressed in the form
x_1 ... x_n, where n ≥ 1 and each x_i is either in X or is the inverse of an
element in X.


Proof. Since subgroups are closed under products and inverses, any
subgroup of G that contains X also contains every product x_1 ... x_n such
that either x_i or x_i^{-1} is in X. Therefore the set H of all products of this
form is in <X>. However, if u = x_1 ... x_m and v = y_1 ... y_n are in H,
then uv = x_1 ... x_m y_1 ... y_n and u^{-1} = x_m^{-1} ... x_1^{-1} are also in H. Thus
H is closed under products and inverses. Since H ≠ ∅, H is a subgroup of
G. Clearly, X ⊆ H, and so H ⊇ <X>. Therefore H = <X>. □

If X and Y are both subsets of G, then <X ∪ Y> is normally written
<X, Y>. We say that G is finitely generated if there is a finite subset X
of G such that G = <X>. If x is an element of G, we write <x> for <{x}>.
If <x> is finite, then |<x>| is called the order of x. A group generated by a
single element is said to be cyclic. The group Z is cyclic since it is generated by 1.
The groups Z_n are generated by the congruence class [1] and so are cyclic
too. If G = <x>, then G = {x^m | m ∈ Z} by Theorem 3. Since x^m x^n = x^{m+n} =
x^n x^m, G is abelian.


THEOREM 4. Let G be a finite cyclic group of order n generated by
the element x. Then x^n = 1 and n is the smallest positive integer with this
property. Moreover, G = {1, x, ..., x^{n−1}}. If x^m = 1, then n divides m.

Proof. Since G is finite, the elements 1 = x^0, x = x^1, x^2, x^3, ... cannot
be distinct. Thus there exist integers r and s with r > s ≥ 0 and x^r = x^s.
Choose such a pair with r as small as possible. Multiplying both sides of the
equation x^r = x^s by (x^s)^{-1} = x^{-s}, we obtain x^{r−s} = 1 = x^0. By
the minimality of r, this means s = 0, and so x^r = 1. Also, the elements
x^0, ..., x^{r−1} are distinct. Since G = <x>, every element of G has the
form x^i for some integer i. Let q be the integral quotient when i is divided
by r. Then i = j + qr and 0 ≤ j < r. Thus

x^i = x^{j+qr} = x^j (x^r)^q = x^j.

Therefore every element of G is in {1, x, ..., x^{r−1}}, and so r = |G| = n. If
x^m = 1, then writing m = j + qn with 0 ≤ j < n gives x^j = 1, which forces
j = 0 since 1, x, ..., x^{n−1} are distinct. Thus m must be divisible by n. □

COROLLARY 5. Let x be an element of a finite group G. Then there
exists a positive integer m such that x^{-1} = x^m.


Proof. Let n be the order of x, that is, the order of <x>. Then n ≥ 1
and x^n = 1 by Theorem 4. Setting m = 2n − 1, we have m > 0 and

x^m = x^{2n−1} = (x^n)^2 x^{-1} = x^{-1}. □
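Theorem 4 and Corollary 5 can be illustrated in the groups U_n. The Python sketch below is an illustration only; `order_mod` is a hypothetical helper that finds the order of [a] in U_n by repeated multiplication, which is adequate for small n.

```python
from math import gcd


def order_mod(a, n):
    """Order of the class [a] in U_n, found by repeated multiplication."""
    if gcd(a, n) != 1:
        raise ValueError("[a] is not in U_n")
    k, power = 1, a % n
    while power != 1 % n:
        power = (power * a) % n
        k += 1
    return k


n_order = order_mod(3, 7)  # order of [3] in U_7

# Theorem 4: [3]^m = [1] exactly when the order divides m.
divisibility_ok = all(
    (pow(3, m, 7) == 1) == (m % n_order == 0) for m in range(1, 50)
)

# Corollary 5: the inverse of [3] is [3]^(2n - 1), where n is the order.
inverse = pow(3, 2 * n_order - 1, 7)
print(n_order, divisibility_ok, (3 * inverse) % 7)
```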


Now we are able to show that for finite groups the condition that a 
subgroup be closed under inverses is superfluous. 


COROLLARY 6. Let H be a nonempty subset of the finite group G.
Then H is a subgroup of G if and only if H is closed under products.


Proof. If H is a subgroup of G, then H is closed under products.
Suppose H is a nonempty subset of G that is closed under products. By
Corollary 5, if x is in H, then there is an integer m > 0 such that x^{-1} = x^m.
Since H is closed under products, x^m is in H, and so H is a subgroup of
G. □


Let G be an N-by-N group table and let A be a vector listing the
elements of a nonempty subset A of ιN. By Corollary 6, A is a subgroup of
(ιN, G) if and only if G[A[I];A[J]] is a component of A for all I and
J in ι⍴A. This is equivalent to the proposition ∧/,G[A;A]∈A.

Let us test some subsets of ι6 to see if they are subgroups of (ι6, G6).


      G6                       ∧/,G6[A;A]∈A←0 2
0 1 2 3 4 5              1
1 0 4 5 2 3                    ∧/,G6[B;B]∈B←0 3 4
2 3 0 1 5 4              1
3 2 5 4 0 1                    ∧/,G6[C;C]∈C←A,B
4 5 1 0 3 2              0
5 4 3 2 1 0




Here we see that {0, 2} and {0, 3, 4} are subgroups while, as we should 
expect, their union is not. 
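The closure test used in this session can be mirrored outside APL. What follows is a Python sketch under the assumption that a group table is a list of rows indexed from 0; `is_closed` is an illustrative name, not a CLASSLIB procedure.

```python
# The group table G6 displayed above.
G6 = [
    [0, 1, 2, 3, 4, 5],
    [1, 0, 4, 5, 2, 3],
    [2, 3, 0, 1, 5, 4],
    [3, 2, 5, 4, 0, 1],
    [4, 5, 1, 0, 3, 2],
    [5, 4, 3, 2, 1, 0],
]


def is_closed(table, subset):
    """True if every product of two elements of subset lies in subset."""
    s = set(subset)
    return all(table[i][j] in s for i in s for j in s)


A = [0, 2]
B = [0, 3, 4]
C = A + B
print(is_closed(G6, A), is_closed(G6, B), is_closed(G6, C))
```

By Corollary 6, for a nonempty subset of a finite group this closure test is all that is needed to decide whether the subset is a subgroup.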

As a further example, we will determine the subgroup of (ι24, G24)
generated by the set A = {3, 8}. The calculation


      ∧/,G24[A;A]∈A←3 8
0


shows that A is not closed under products, and so we form the union of
A with the set of products of pairs of elements in A.


      ⎕←A←SSORT A,,G24[A;A]
3 4 7 8 12 16
      ∧/,G24[A;A]∈A
0


We still do not have a subgroup, so we repeat the process.


      ⎕←A←SSORT A,,G24[A;A]
0 3 4 7 8 11 12 15 16 19 20 23
      ∧/,G24[A;A]∈A
1


The set listed in A is now a subgroup, and so <3, 8> = {0, 3, 4, 7, 8, 11,
12, 15, 16, 19, 20, 23}.

The procedure used in the preceding example is inefficient, since it
computes many unnecessary products. A better algorithm is described in
Exercise 13. This algorithm is used in the procedure GTSGP in CLASSLIB.


      GTINIT G6
      GTSGP 3
0 3 4
      GTINIT G24
      GTSGP 3 8
0 3 4 7 8 11 12 15 16 19 20 23


The procedure GTSGP takes a single argument that lists a set of elements
in the current abstract group. The result is a list of the elements in the
subgroup generated by the set.
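The repeated-closure procedure of the worked example (not the more efficient algorithm of Exercise 13, and not the actual GTSGP code) can be sketched in Python as follows. The function name is an assumption, and the generating set is assumed to be nonempty.

```python
# The group table G6 from earlier in this section.
G6 = [
    [0, 1, 2, 3, 4, 5],
    [1, 0, 4, 5, 2, 3],
    [2, 3, 0, 1, 5, 4],
    [3, 2, 5, 4, 0, 1],
    [4, 5, 1, 0, 3, 2],
    [5, 4, 3, 2, 1, 0],
]


def generated_subgroup(table, gens):
    """Close a nonempty set of elements under products in a finite group.

    Each pass adds all products of pairs of current elements, as in the
    worked example; by Corollary 6 the closed set is the subgroup <gens>.
    """
    current = set(gens)
    while True:
        products = {table[i][j] for i in current for j in current}
        if products <= current:
            return sorted(current)
        current |= products


print(generated_subgroup(G6, [3]))
print(generated_subgroup(G6, [1, 3]))
```

Each pass recomputes every pairwise product, which is the inefficiency the text points out.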

It would be interesting to know how many subgroups there are in
the group (ι24, G24). However, as ι24 has 2^24 or 16777216 subsets, it
is clear that we will have to have a better way of finding out than just trying
each subset in turn. To do this, we will need some more theory. It will turn
out that a randomly chosen subset of a group is very unlikely to be a
subgroup.

Throughout the rest of this section we will assume that H is a subgroup
of the group G. If x and y are in G, we say x is right congruent to y modulo
H if xy^{-1} is in H.


THEOREM 7. Right congruence modulo H is an equivalence relation
on G.


Proof. For any x in G we know that xx^{-1} = 1 is in H, and so right
congruence is reflexive. If xy^{-1} is in H, then yx^{-1} = (xy^{-1})^{-1} is also in
H, since H is closed under inverses. Therefore, if x is right congruent to
y, then y is right congruent to x. Finally, suppose x is right congruent to
y and y is right congruent to z. Then xy^{-1} and yz^{-1} are both in H. Hence
xz^{-1} = (xy^{-1})(yz^{-1}) is in H as H is closed under products. Thus x is right
congruent to z, and we have an equivalence relation. □


The equivalence classes of right congruence modulo H are called the
right cosets of H in G. The next theorem gives an alternate description of
the right cosets of H.


THEOREM 8. The right coset of H containing x is Hx, and there is a
bijection from H to Hx.


Proof. The set Hx is defined to be {hx | h ∈ H}. Since (hx)x^{-1} = h, any
element of Hx is right congruent to x modulo H. Suppose y is an element
of G that is right congruent to x modulo H. Then h = yx^{-1} is in H, and
y = (yx^{-1})x = hx is in Hx. Thus Hx is the right coset of H containing x.
Now let f be the map from H to Hx that takes h to hx. The definition of
Hx implies immediately that f is surjective, and the right cancellation law
shows that f is injective. Therefore f is bijective. □

If the set of right cosets of H in G is finite, then the number of right
cosets is called the index of H in G and is written |G:H|.


THEOREM 9 (Lagrange). If G is finite, then |G| = |G:H| × |H|. In
particular, |H| divides |G|.


Proof. We have |G:H| right cosets of H, each with |H| elements. Since
the right cosets of H form a partition of G, the number of elements in G
is just the sum of the number of elements in each right coset. □


Theorem 9 is named for the Italian-French mathematician Joseph- 
Louis Lagrange (1736-1813). 
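Theorem 8 and Lagrange's Theorem can be checked on the table G6 from earlier in this section. The Python sketch below is an illustration only; `right_cosets` is a hypothetical helper that forms each coset Hx as the set of products hx.

```python
# The group table G6 from earlier in this section.
G6 = [
    [0, 1, 2, 3, 4, 5],
    [1, 0, 4, 5, 2, 3],
    [2, 3, 0, 1, 5, 4],
    [3, 2, 5, 4, 0, 1],
    [4, 5, 1, 0, 3, 2],
    [5, 4, 3, 2, 1, 0],
]


def right_cosets(table, H):
    """List the distinct right cosets Hx = {hx : h in H} of a subgroup H."""
    cosets, seen = [], set()
    for x in range(len(table)):
        if x not in seen:
            coset = sorted(table[h][x] for h in H)
            cosets.append(coset)
            seen.update(coset)
    return cosets


H = [0, 2]
cosets = right_cosets(G6, H)
index = len(cosets)
print(cosets, index * len(H) == len(G6))
```

Every coset has |H| elements and the cosets partition the group, which is exactly the counting argument in the proof.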


COROLLARY 10. A group of prime order is cyclic. 


Proof. Let G be a group of prime order p. Choose an element x in 
G different from the identity. The subgroup <x> generated by x con- 
tains x and the identity element and has order dividing p. By the definition 
of a prime, |<x>| = p, and so <x> = G. Thus G is generated by any one of 
its nonidentity elements. □


COROLLARY 11. If x is an element of the finite group G, then x^{|G|} = 1.

Proof. Let n be the order of x and let m = |G:<x>|. By Lagrange's
Theorem, |G| = mn, and so

x^{|G|} = x^{mn} = (x^n)^m = 1^m = 1,

since x^n = 1 by Theorem 4. □


If we apply Corollary 11 to the group U_n, we get an important result
in number theory.


COROLLARY 12. Let a and n be integers with n > 0. If gcd(a, n) =
1, then a^{φ(n)} ≡ 1 (mod n).


Proof. Suppose gcd(a, n) = 1. Then the congruence class [a] is in
U_n, which has order φ(n). By Corollary 11, we have [a]^{φ(n)} = [1] or
a^{φ(n)} ≡ 1 (mod n). □


Corollary 12 gives us a very powerful test for deciding whether a given
integer is a prime.


COROLLARY 13. Let p be a prime and let a be an integer not divisible
by p. Then a^{p−1} ≡ 1 (mod p).


Proof. Since p is a prime, φ(p) = p − 1. □


Suppose we fix a small integer a greater than 1, say a = 2 or a = 3.
Given a large integer n, it is easy to see whether or not gcd(a, n) = 1 using
the Euclidean algorithm. If gcd(a, n) ≠ 1, then n is certainly not a prime.
If gcd(a, n) = 1, then we can use the binary power algorithm described in
Section 1 to compute the remainder of a^{n−1} modulo n. If a^{n−1} is not
congruent to 1 modulo n, then n is not a prime. If a^{n−1} ≡ 1 (mod n), then we
say that n is a pseudoprime relative to a. In this case the chances are very
good that n is a prime. One way to prove that n is a prime is to show that U_n
has order n − 1. The exercises describe an approach to this problem that
assumes that the prime factors of n − 1 are known.
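The test just described is short to write down. This Python sketch uses the built-in three-argument `pow` in place of the binary power algorithm of Section 1; the function name `is_pseudoprime` is an assumption, and n is assumed to be larger than a.

```python
from math import gcd


def is_pseudoprime(n, a=2):
    """True if gcd(a, n) = 1 and a^(n-1) is congruent to 1 modulo n.

    In the sense used in the text, every prime n > a passes this test,
    so a True result does not by itself prove that n is a prime.
    """
    if gcd(a, n) != 1:
        return False
    return pow(a, n - 1, n) == 1


print(is_pseudoprime(101), is_pseudoprime(15), is_pseudoprime(341))
```

The value 341 = 11 × 31 passes the test relative to 2 although it is composite, which is why the text says only that the chances are very good that a pseudoprime is a prime.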

Having seen the definitions of right congruence and right cosets, one
can easily construct the definitions of left congruence and left cosets. Two
elements x and y of G are left congruent modulo H if x^{-1}y is in H. Left
congruence is readily seen to be an equivalence relation on G, and the
equivalence classes are the left cosets of H. The analogue of Theorem 8 holds so
that the left coset containing x is xH. If G is finite, then the number of
elements in a right coset of H is the same as the number in a left coset,
that is, |H|. Thus the number of right and left cosets must be the same.
Even if G is infinite, there is still the "same number" of right and left cosets
in the sense that there is a bijection from the set of right cosets to the set
of left cosets. (See Exercise 22.)


Let us consider an example in which G is the group (ι24, G24). The
cyclic subgroup H generated by 9 has order 4.


      GTINIT G24
      ⎕←H←GTSGP 9
0 9 16 18


Right and left congruence modulo H are equivalence relations on ι24. How
can we construct the characteristic matrices for these equivalence relations?
The inverse of J in G is INV24[J]. Two elements I and J of ι24 are
right congruent modulo H if and only if G24[I;INV24[J]] is in H. Thus
the characteristic matrix for right congruence modulo H is


      R←G24[;INV24]∈H

Similarly, the characteristic matrix for left congruence is

      L←G24[INV24;]∈H

We can check directly that right and left congruence are equivalence
relations.

      SEQREL R                 SEQREL L
1                        1


We can also verify that R and L represent different relations.


      ∧/,R=L
0


The procedures GTRCON and GTLCON construct the characteristic
matrices for right and left congruence, respectively, modulo a subgroup
of the current abstract group. We could have obtained R and L by


      R←GTRCON H
      L←GTLCON H
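The same two characteristic matrices can be built for the smaller table G6. The Python sketch below is not the GTRCON or GTLCON code; the function names are assumptions, and the subgroup H = {0, 2} of (ι6, G6) is used in place of the subgroup of G24.

```python
# The group table G6 from earlier in this section.
G6 = [
    [0, 1, 2, 3, 4, 5],
    [1, 0, 4, 5, 2, 3],
    [2, 3, 0, 1, 5, 4],
    [3, 2, 5, 4, 0, 1],
    [4, 5, 1, 0, 3, 2],
    [5, 4, 3, 2, 1, 0],
]


def inverses(table):
    """Vector of inverses, with the identity located from the table."""
    n = len(table)
    e = next(c for c in range(n) if all(table[c][x] == x for x in range(n)))
    return [next(y for y in range(n) if table[x][y] == e) for x in range(n)]


def right_congruence(table, H):
    """R[i][j] is True when i * inverse(j) is in H."""
    inv, h, n = inverses(table), set(H), len(table)
    return [[table[i][inv[j]] in h for j in range(n)] for i in range(n)]


def left_congruence(table, H):
    """L[i][j] is True when inverse(i) * j is in H."""
    inv, h, n = inverses(table), set(H), len(table)
    return [[table[inv[i]][j] in h for j in range(n)] for i in range(n)]


R = right_congruence(G6, [0, 2])
L = left_congruence(G6, [0, 2])
print(R == L)
```

For this subgroup the two relations differ, as they do for the subgroup of G24 in the text: 1 is right congruent to 3 but left congruent to 4.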


EXERCISES 


1   Give an example of a nonempty subset of Z that is closed under
    sums but is not a subgroup of Z.

2   Suppose H is a nonempty subset of a group G such that for all
    x and y in H the product xy^{-1} is in H. Show that H is a subgroup
    of G.

3   Let K be the set of rational numbers m/n with m and n integers
    and n a power of 3. Prove that K is a subgroup of (Q, +).

4   Which of the following sets are subgroups of (Q − {0}, ×)?
    (a) The set of positive rational numbers.
    (b) The set of negative rational numbers.
    (c) {1, −1}.
    (d) The set of rational numbers x with x > 1.

5   Complete the proof of Theorem 1.

6   Let x be an element of a group and let M = {m ∈ Z | x^m = 1}. Show
    that M is a subgroup of Z.

7   Let L be the set of all elements in Σ_n that map the set {0, 1, 2}
    into itself. Prove that L is a subgroup of Σ_n and determine |L|.

8   Let X be a subset of the group G and let C be the set of elements
    g in G such that xg = gx for all x in X. Show that C is a subgroup
    of G. (We call C the centralizer of X in G.)

9   Let G be an N-by-N group table and let X be a vector listing a
    subset of ιN. Write an APL expression for the characteristic vector
    of the centralizer of X in (ιN, G). Determine the centralizer of 27
    in (ι60, G60).

10  Let H and K be subgroups of the abelian group G. Show that
    HK is a subgroup of G.

11  Let r and s be integers. Prove that (rZ) + (sZ) = dZ and (rZ) ∩
    (sZ) = mZ, where d = gcd(r, s) and m = lcm(r, s).

12  Let X be a nonempty subset of the finite group G. Show that
    <X> is the smallest subset H of G such that X ⊆ H and HX ⊆ H.

13  Let G and X be as in Exercise 12. Show that the following
    algorithm terminates with H = <X>.
    (a) Set H and Y equal to {1}.
    (b) Set Y equal to YX − H.
    (c) If Y = ∅, stop.
    (d) Set H equal to H ∪ Y and go to step (b).

14  Suppose x is an element of a group and x has order 40. What are
    the orders of x?, x*, and x78?

15  Find all subgroups of (ι6, G6).

16  Find the cyclic subgroups of the groups (ι24, G24) and (ι60,
    G60). How many elements of each order do these groups have?

17  Show that (ι24, G24) and (ι60, G60) can each be generated by
    two elements.

18  In the text we found a subgroup A of (ι24, G24) with order 12.
    Construct a group table for A.

19  Show that the group A in Exercise 18 has no subgroup of order 6.


20  Show that the set of elements x in the group A of Exercise 18
    that satisfy x* = 1 forms a subgroup of A.

21  Let n be a positive integer. Show that the cosets of nZ in Z are
    the congruence classes modulo n and thus that |Z:nZ| = n.

22  Let H be a subgroup of the group G. Show that two elements
    x and y of G are right congruent modulo H if and only if x^{-1}
    and y^{-1} are left congruent modulo H. Use this to show that there
    is a bijection from the set of right cosets of H onto the set of left
    cosets of H.

23  Show that any group of order 4 is abelian. (Hint. Either there is
    an element of order 4 or Exercise 2.8 applies.)

24  Let p be a prime and let a be any integer. Prove that a^p ≡ a (mod p).

25  Let x be an element of a group and let n be a positive integer.
    Suppose x^n = 1, but x^{n/p} ≠ 1 for all primes p dividing n. Show that
    x has order n.

26  Let x be an element of a group and suppose an integer n is known
    such that x^n = 1. Assuming the prime factors of n are given,
    describe an efficient procedure for determining the order of x.

27  Find the order of [2] in U4.

28  Given that 2979909 ≡ 1 (mod 1000001), determine the order of [2]
    in U_1000001. (The procedure ZNPOWER described in Exercise 1.25
    will be useful here.)

29  Let n be an integer greater than 1. Assume we have a sequence
    a_1, ..., a_s of integers relatively prime to n and we know the
    order m_i of [a_i] in U_n. Suppose lcm(m_1, ..., m_s) = n − 1. Prove
    that n is a prime.

30  Construct an approach based on Exercises 26 and 29 and the
    remarks following Corollary 13 for deciding whether or not an
    integer n is a prime, assuming that the prime factors of n − 1 can
    be obtained.

*31 On most APL terminal systems, computation in U_n with n > 10°
    requires multiple-precision calculations. The procedure ZNPOWER
    uses double precision when necessary to produce correct results,
    even when N*2 exceeds the usual precision of the system. Use
    ZNPOWER to determine which of the following integers are
    pseudoprimes relative to 2.
    (a) 999999937.
    (b) 159890287921.
    (c) 1099511627689.
    (These computations may require significant amounts of CPU
    time.)

*32 Show that p = 1000037 is a prime. From this it follows that U_p has
    order p − 1. Show that the class [2] has order p − 1 in U_p and
    therefore is a generator of U_p. This means that there is an integer
    m such that [2]^m = [3] or, equivalently, 2^m ≡ 3 (mod p). Find
    one such m.

33  Let G be a finite group. Prove that there is a subset X of G such
    that G = <X> and |X| ≤ log_2 |G|.

34  Is (Q, +) finitely generated?


In Exercises 35 to 37 assume that the following statements have been
executed as in the last example in the text.


      GTINIT G24
      H←GTSGP 9
      R←GTRCON H
      L←GTLCON H


35  Let REP←SSORT SFEL R. Explain why REP lists a set of
    representatives for the right cosets of H in (ι24, G24). Verify that
    (⍴REP)×⍴H is 24, in agreement with Lagrange's Theorem.

36  Write an APL expression for L in terms of R and INV24. (Hint.
    See Exercise 22.)

37  Construct the characteristic vector for the set N of integers I in
    ι24 such that R[I;] = L[I;]. Check that N is a subgroup of
    (ι24, G24). Is this an accident?


4. HOMOMORPHISMS 


In this section we will study maps from one group to another. We will not
be interested in just any maps but only in those that are "compatible" with
the binary operations on the two groups. Let (G, •) and (H, *) be groups
and let h be a function from G to H. What does it mean for h to be
compatible with • and *? We say that h is a homomorphism from G to H if,
for all x and y in G, we have (x • y)h = (xh) * (yh). Informally, we say that
under a homomorphism the image of a product is the product of images.
(Of course, with additive notation we would say the image of a sum is the
sum of the images.)

Let us look at some examples of group homomorphisms. Let G be any
group and let x be an element of G. The first law of exponents in Theorem
2.4 states that the map that takes an integer m to x^m is a homomorphism
from Z to G. Here we are writing the binary operation in Z additively and
the operation in G multiplicatively. The image of Z under this
homomorphism is the cyclic subgroup of G generated by x. A particular case of this
homomorphism is the map from Z onto Z_n taking m to the congruence
class [m].

Let G and H be any two groups. The trivial homomorphism from G
to H is the map that sends every element of G to the identity element of
H. The identity function on G is a homomorphism from G to itself. If G
is abelian, then for any x and y in G we have

(xy)^{-1} = y^{-1}x^{-1} = x^{-1}y^{-1},

so the inverse map is a homomorphism from G to G.

Let G and H be M-by-M and N-by-N group tables, respectively. A
homomorphism from (ιM, G) to (ιN, H) is a vector F of length M with
components in ιN such that for all I and J the value of F[G[I;J]] is equal
to H[F[I];F[J]] or, equivalently, such that the matrices F[G] and
H[F;F] are the same. The workspace EXAMPLES contains a vector H24TO6.
Let us rename this vector F and show that F is a homomorphism of
(ι24, G24) onto (ι6, G6).


      ⍴F←H24TO6                ∧/,F[G24]=G6[F;F]
24                       1
      ∧/F∈ι6                   SSORT F
1                        0 1 2 3 4 5


Not only does F map products to products, it also takes the identity
element of G24 to the identity element of G6,


      F[0]
0


and it maps inverses to inverses.


      ∧/F[INV24]=INV6[F]
1


This is true for any homomorphism.


THEOREM 1. Let h be a homomorphism from the group (G, •) to
the group (H, *). Then h takes the identity element of G to the identity
element of H and for all x in G we have (xh)^{-1} = x^{-1}h. If K is a subgroup
of G, then Kh is a subgroup of H, and if L is a subgroup of H, then Lh^{-1} is
a subgroup of G.

Proof. Let e be the identity of G and let f be the identity of H. Then

(eh) * f = eh = (e • e)h = (eh) * (eh),

and so, by the left cancellation law, f = eh. Suppose x is any element of G.
Then

f = eh = (x • x^{-1})h = (xh) * (x^{-1}h)

and, by the uniqueness of inverses in H, it follows that (xh)^{-1} = x^{-1}h. Now
let K be a subgroup of G. Then Kh is nonempty, since K is nonempty.
Suppose u and v are in Kh. There exist elements x and y of K such that u = xh
and v = yh. Therefore

u * v = (xh) * (yh) = (x • y)h

and

u^{-1} = (xh)^{-1} = x^{-1}h.

Since x • y and x^{-1} are both in K, we see that Kh is closed under products
and inverses and so is a subgroup of H. We leave as an exercise the proof that
the inverse image of a subgroup of H is a subgroup of G. □
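The table test for a homomorphism and the first conclusions of Theorem 1 can be tried on a small case. The Python sketch below does not use H24TO6, which is not reproduced here; instead it assumes a vector F from (ι6, G6) to the two-element table Z2, chosen for illustration so that 0, 3, 4 map to 0 and 1, 2, 5 map to 1.

```python
# The group table G6 from Section 3 and a table for a group of order 2.
G6 = [
    [0, 1, 2, 3, 4, 5],
    [1, 0, 4, 5, 2, 3],
    [2, 3, 0, 1, 5, 4],
    [3, 2, 5, 4, 0, 1],
    [4, 5, 1, 0, 3, 2],
    [5, 4, 3, 2, 1, 0],
]
Z2 = [[0, 1], [1, 0]]

# Illustrative map: 0, 3, 4 go to 0 and 1, 2, 5 go to 1.
F = [0, 1, 1, 0, 0, 1]
INV6 = [0, 1, 2, 4, 3, 5]  # inverses in (iota 6, G6)
INV2 = [0, 1]              # inverses in the group of order 2


def is_homomorphism(F, G, H):
    """True if F[G[i][j]] equals H[F[i]][F[j]] for all i and j."""
    n = len(G)
    return all(F[G[i][j]] == H[F[i]][F[j]] for i in range(n) for j in range(n))


print(is_homomorphism(F, G6, Z2))
# Identity goes to identity, and inverses go to inverses.
print(F[0] == 0, all(F[INV6[x]] == INV2[F[x]] for x in range(6)))
# The inverse image of the trivial subgroup {0} of Z2.
kernel = [x for x in range(6) if F[x] == 0]
print(kernel)
```

The inverse image computed at the end is the subgroup {0, 3, 4} found in Section 3, in line with the last statement of the theorem.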


Homomorphisms can be composed to form new homomorphisms. 


THEOREM 2. Let f: G → H and g: H → K be homomorphisms of groups.
Then f ∘ g is a homomorphism from G to K.


Proof. See Exercise 4. □


The following table defines a group with {0, 1} as its set of elements.


     + | 0  1
     --+------
     0 | 0  1
     1 | 1  0


The table


     × |  1  −1
     --+--------
     1 |  1  −1
    −1 | −1   1


describes the group ({1, −1}, ×). Although these groups are distinct, they are
very closely related. If each 0 in the table for the first group is replaced by
a 1 and each 1 replaced by a −1, we get the table for the second group.
That is, the second group can be obtained by "renaming" the elements of
the first group. Two groups related in this way are said to be isomorphic.
The concept of isomorphic algebraic structures is a very important idea.

We formalize the notion of isomorphic groups in the following way.
Let G and H be groups. An isomorphism of G onto H is a bijective
homomorphism from G to H. We say G is isomorphic to H if there exists an
isomorphism of G onto H. For example, the function that takes each real
number x to 2^x is an isomorphism of (R, +) onto (R^+, ×), where R^+
denotes the set of positive real numbers. Also, if n is a nonzero integer, then
the map m ↦ nm is an isomorphism of Z onto its subgroup nZ. If G is
isomorphic to H, we write G ≅ H.


THEOREM 3. Let f: G → H and g: H → K be isomorphisms of groups.
Then f^{-1} and f ∘ g are isomorphisms.


Proof. Theorem 2, along with Theorem 1.3.2, implies immediately
that f ∘ g is an isomorphism. To show that f^{-1} is an isomorphism, we
need only show that it is a homomorphism, since the inverse of a bijection
is always a bijection. Let u and v be in H and set x = uf^{-1} and y = vf^{-1}.
Then (xy)f = uv, and so (uv)f^{-1} = xy = (uf^{-1})(vf^{-1}). □

COROLLARY 4. Let G, H, and K be groups. Then

(a) G ≅ G.

(b) If G ≅ H, then H ≅ G.

(c) If G ≅ H and H ≅ K, then G ≅ K.


Proof. The identity function on G is an isomorphism of G onto G.
Thus G is isomorphic to itself. Suppose f: G → H is an isomorphism. Then,
by Theorem 3, f^{-1} is an isomorphism from H to G, so H is isomorphic to
G. Part (c) is equally trivial. □


In view of Corollary 4 it is tempting to call = an equivalence relation 
on the set of all groups. However, the set of all groups is one of those sets 
that is too big in the sense that its consideration can lead to logical contra- 
dictions. Nevertheless, the analogy with equivalence relations is useful. 

Two groups that are isomorphic are often considered to be the same. 
In the 1—1 correspondence preserving products between the elements of 
the groups, subgroups correspond to subgroups and cosets correspond to 
cosets. In fact, their entire structures as groups are identical. For example, 
let X = {x} be a set with one element. There is exactly one binary
operation $\circ$ on X, the one in which $x \circ x$ is x. The pair $(X, \circ)$ is a group. Clearly,
any two such groups are isomorphic. Another way of saying this is that up
to isomorphism there is only one group of order 1. Very often the phrase
"up to isomorphism" is omitted in such statements.

Let G be a finite group of order N and let T be the group table for
G relative to some numbering $x_0, \ldots, x_{N-1}$ of the elements of G. We have
already remarked that (ιN, T) is a group. The map taking I to $x_I$ is an
isomorphism of (ιN, T) onto G.


EXERCISES 
1 Let G be an abelian group and let n be an integer. Show that the
map taking x to $x^n$ is a homomorphism of G into itself.

2 Listed here are pairs consisting of a group G and a map f of G into
G. In each case determine whether f is a homomorphism.

(a) $G = Z$, $f: x \mapsto 3x$.

(b) $G = (Q, +)$, $f: x \mapsto x^2$.

(c) $G = (Q - \{0\}, \times)$, $f: x \mapsto x^2$.

(d) $G = (R^+, \times)$, where $R^+ = \{x \in R \mid x > 0\}$, $f: x \mapsto \sqrt{x}$.

3 Complete the proof of Theorem 1.

4 Prove Theorem 2.

5 Let $(G, \circ)$ be a group and let $*$ be a binary operation on a set H.
Suppose there is a map f of G onto H such that $(x \circ y)f = (xf) * (yf)$
for all x and y in G. Prove that $(H, *)$ is a group.

6 Let $f: G \to H$ be a surjective homomorphism of groups. Show that:

(a) If G is abelian, then H is abelian.

(b) If $G = \langle x_1, \ldots, x_r \rangle$, then $H = \langle x_1f, \ldots, x_rf \rangle$.

7 Let $f: X \to Y$ be a bijection. Prove that $\Sigma(X)$ and $\Sigma(Y)$ are
isomorphic.

8 Exhibit an injective homomorphism of Z, into Zg and a surjective
homomorphism of Zs, onto Z,.

9 Let T be a group table for a finite group G of order N. Show that
(ιN, T) is isomorphic to G. Prove that a matrix S is a group table
for G if and only if there is a permutation P of ιN such that S
is P[T[⍋P;⍋P]].

10 Show that up to isomorphism there is only one group of order 2
and one group of order 3.

11 Exhibit two nonisomorphic groups of order 4.

12 Give an upper bound for the number of groups of order N up to
isomorphism.

13 Let G be a finite group and let $f: G \to H$ be a group homomorphism.
Assume that K and L are subgroups of G with $K \subseteq L$. Show:

(a) $Kf \subseteq Lf$ and $|Lf : Kf| \le |L : K|$.

(b) If x is in G, then the order of xf is less than or equal to the
order of x.

14 Show that (ι6, G6) is isomorphic to $\Sigma_3$. (A group table for $\Sigma_3$
was constructed in Exercise 1.20.)

15 Let G2←2|(ι2)∘.+ι2. Construct homomorphisms of (ι6, G6)
and (ι24, G24) onto (ι2, G2).

16 In the text we noted that the group of real numbers under addition
is isomorphic to the group of positive real numbers under
multiplication. Can the same thing be said about the rationals? That is,
are $(Q, +)$ and $(Q^+, \times)$ isomorphic, where $Q^+ = \{x \in Q \mid x > 0\}$?




17 The workspace EXAMPLES contains vectors F1, F2, and F3
of length 24, each of which describes a map from ι24 to ι60.
Which of these maps are homomorphisms from (ι24, G24) to
(ι60, G60)?


18 Execute the following APL statements.


      GTINIT G24
      K←GTSGP 3 8
      F←H24TO6


We know that K is a subgroup of (ι24, G24) of order 12 and
that F is a homomorphism of (ι24, G24) onto (ι6, G6).
Compute a vector KF listing the elements of the image of K under
F. Show directly that KF is a subgroup of (ι6, G6). Now execute


      GTINIT G6
      L←GTSGP 1


Construct a vector LFI listing the elements of the inverse image
of L under F. Show that LFI is a subgroup of (ι24, G24).


5. NORMAL SUBGROUPS 


In the last section we observed that the vector F←H24TO6 is a
homomorphism from (ι24, G24) onto (ι6, G6). Let us find the set K of elements
in ι24 mapped to the identity element 0 of (ι6, G6) by F.


      ⎕←K←(F=0)/ι24
0 7 16 23


Since {0} is a subgroup of (ι6, G6), Theorem 4.1 tells us that K is a
subgroup of (ι24, G24). We can also check this directly.


      ∧/,G24[K;K]∊K
1


As we will see, K has an important property not possessed by all subgroups
of (ι24, G24). Let us define three logical matrices.


      GTINIT G24
      R←GTRCON K
      L←GTLCON K
      P←F∘.=F


The matrices R and L are the characteristic matrices for right and left
congruence, respectively, modulo K, and P[I;J] is 1 if and only if F maps
I and J to the same element of ι6. It turns out that all three of these
matrices are the same.


      ∧/,R=L
1
      ∧/,R=P
1


This section is devoted to showing why this is not a coincidence. 

Let H be a subgroup of a group G. We say that H is normal in G and
write $H \trianglelefteq G$ if for all x in G we have $x^{-1}Hx \subseteq H$, where, as usual, $x^{-1}Hx$
means $\{x^{-1}yx \mid y \in H\}$. It follows immediately from the definition that
{1} and G are normal subgroups of G. There are many conditions that are 
equivalent to normality. Some of these conditions are given in the next 
theorem. 


THEOREM 1. Let H be a subgroup of a group G. Then the following
are equivalent.

(a) H is normal in G.

(b) If x is in G, then $x^{-1}Hx = H$.

(c) Right and left congruence modulo H are the same.

(d) Every right coset of H is also a left coset of H.

(e) The product of two right cosets of H is a right coset of H.


Proof. As is usual in proofs of theorems of this type, we will show that 
each of the first four conditions implies the next and that (e) implies (a). 

(a) implies (b). Suppose $H \trianglelefteq G$ and x is in G. Then

$xHx^{-1} = (x^{-1})^{-1}Hx^{-1}$

is contained in H, so

$H = (x^{-1}x)H(x^{-1}x) = x^{-1}(xHx^{-1})x \subseteq x^{-1}Hx \subseteq H$.

Thus $x^{-1}Hx = H$.

(b) implies (c). Assume (b) and let x and y be in G. If $yx^{-1}$ is in H, then
$x^{-1}(yx^{-1})x = x^{-1}y$ is in $x^{-1}Hx = H$. Therefore right congruence modulo
H implies left congruence modulo H. The reverse implication is equally
trivial.

(c) implies (d). Let x be in G. Then the right coset Hx is the set of
elements of G right congruent to x modulo H, while the left coset xH is
the set of elements left congruent to x. If right and left congruence are the
same, then Hx = xH.

(d) implies (e). Assume (d) and consider a product (Hx)(Hy) of two
right cosets of H. By (d), Hx is a left coset of H and, since x is in Hx, we
must have Hx = xH. Therefore

(Hx)(Hy) = xHHy = xHy = Hxy.

(e) implies (a). Assume (e) and let x be in G. Then $(Hx^{-1})(Hx)$ is a
right coset and, since it contains $1x^{-1}1x = 1$, we have $(Hx^{-1})(Hx) = H$.
But $Hx^{-1}Hx$ contains $x^{-1}Hx$, and so $x^{-1}Hx \subseteq H$. Therefore $H \trianglelefteq G$. □


COROLLARY 2. If H is a subgroup of G and |G:H| = 2, then $H \trianglelefteq G$.


Proof. Since |G:H| = 2, there are exactly two right cosets of H in G.
By Exercise 3.22, there are exactly two left cosets of H in G. Let x be in
G - H. Then the right cosets of H are H and Hx and the left cosets are
H and xH. Thus Hx = G - H = xH, and every right coset of H is also a left
coset. Thus $H \trianglelefteq G$ by Theorem 1. □
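Conditions (a) and (d) of Theorem 1 are easy to test by brute force in a small group. The Python sketch below is an illustration only: it encodes a permutation of {0, 1, 2} as a tuple of images, and the names `compose`, `inverse`, `is_normal`, `right_cosets`, and `left_cosets` are hypothetical helpers, not functions from the book's workspace.

```python
from itertools import permutations

def compose(p, q):
    """Apply p first, then q, matching the right-operator convention x(pq) = (xp)q."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def is_normal(H, G):
    """Condition (a): x^-1 h x lies in H for every x in G and h in H."""
    H = set(H)
    return all(compose(compose(inverse(x), h), x) in H for x in G for h in H)

def right_cosets(H, G):
    return {frozenset(compose(h, x) for h in H) for x in G}

def left_cosets(H, G):
    return {frozenset(compose(x, h) for h in H) for x in G}

S3 = list(permutations(range(3)))
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # identity and both 3-cycles: index 2
T = [(0, 1, 2), (1, 0, 2)]               # identity and one transposition: index 3
```

The subgroup `A3` has index 2 and passes both tests, as Corollary 2 predicts. The subgroup `T` fails them, so a subgroup of index 3 need not be normal.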


In the example at the beginning of this section we verified that the
subgroup K of (ι24, G24) satisfies condition (c) of Theorem 1. Therefore
K is a normal subgroup. In Section 3 we found that right and left
congruence in (ι24, G24) modulo the cyclic subgroup generated by 9 are not
the same. Thus not every subgroup of (ι24, G24) is normal.

Suppose N is a normal subgroup of a group G. By Theorem 1, we do
not need to distinguish between left and right cosets of N. We will let G/N
denote the set of cosets of N in G. If H is a group and $f: G \to H$ is a
homomorphism, then $\{1\}f^{-1}$, the inverse image of the trivial subgroup of H, is
called the kernel of f. In the preceding example, K is the kernel of F.


THEOREM 3. Let $f: G \to H$ be a homomorphism of groups. Then the
kernel N of f is a normal subgroup of G, and two elements of G are mapped
to the same element of H if and only if they are in the same coset of N. If
$L \trianglelefteq H$, then $Lf^{-1} \trianglelefteq G$. If $K \trianglelefteq G$, then $Kf \trianglelefteq Gf$.

Proof. By Theorem 4.2, we know that $N = \{1\}f^{-1}$ is a subgroup of
G. Suppose that x is in N and u is in G. Then

$(u^{-1}xu)f = (u^{-1}f)(xf)(uf) = (uf)^{-1}1(uf) = 1$,

so $u^{-1}xu$ is in N. Therefore $N \trianglelefteq G$.

Next let x and y be any two elements of G. Then xf = yf if and only
if $(xf)^{-1}(yf) = 1$. However, $(xf)^{-1}(yf) = (x^{-1}y)f$ and $(x^{-1}y)f = 1$ if
and only if $x^{-1}y$ is in N. Therefore xf = yf if and only if x and y are in the
same coset of N.

Now suppose $L \trianglelefteq H$ and let x be in $Lf^{-1}$ and y in G. Then

$(y^{-1}xy)f = (yf)^{-1}(xf)(yf) \in (yf)^{-1}L(yf) = L$,

so $y^{-1}xy$ is in $Lf^{-1}$. Therefore $Lf^{-1} \trianglelefteq G$. Finally, suppose $K \trianglelefteq G$ and let u
be in K and v in G. Then

$(vf)^{-1}(uf)(vf) = (v^{-1}uv)f \in Kf$,

and, since uf is a typical element of Kf and vf is a typical element of Gf, it
follows that $Kf \trianglelefteq Gf$. □
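The second assertion of Theorem 3 says that the fibers of f are exactly the cosets of its kernel. A minimal Python sketch, assuming the additive group $Z_{12}$ and the reduction map $x \mapsto x \bmod 4$ as a stand-in example (neither appears in the text at this point), makes the statement concrete. The names `kernel`, `fibers`, and `cosets` are illustrative.

```python
def kernel(f, G, identity_H):
    """Elements of G mapped by f to the identity of H."""
    return [x for x in G if f(x) == identity_H]

def fibers(f, G):
    """Partition of G into sets of elements with the same image under f."""
    by_image = {}
    for x in G:
        by_image.setdefault(f(x), set()).add(x)
    return {frozenset(s) for s in by_image.values()}

def cosets(N, G, op):
    """The cosets Nx of N in G (written with the group operation op)."""
    return {frozenset(op(n, x) for n in N) for x in G}

n = 12
G = list(range(n))
add = lambda x, y: (x + y) % n
f = lambda x: x % 4          # a homomorphism of Z_12 onto Z_4, since 4 divides 12
N = kernel(f, G, 0)          # [0, 4, 8]
```

Comparing `fibers(f, G)` with `cosets(N, G, add)` shows that the two partitions of G coincide.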




COROLLARY 4. In Theorem 3 the homomorphism f is injective if and 
only if the kernel N is trivial. 


Proof. By Theorem 3 the map f is injective if and only if all the cosets
of N have exactly one element and, by Theorem 3.8, this holds if and
only if N = {1}. □


It is important to note that if $f: G \to H$ is a group homomorphism and
$K \trianglelefteq G$, then Kf is not generally normal in H but only normal in Gf.

Let N be a normal subgroup of the group G. In view of Theorem 3 
it is natural to ask whether there exists a group H and a homomorphism 
$f: G \to H$ such that N is the kernel of f. The answer is that we can find such a
pair H and f. In fact, we can take the elements of H to be the cosets of 
N. By Theorem 1, the product of any two cosets of N is again a coset of 
N. Thus the set G/N has a binary operation defined on it. 


THEOREM 5. Let $N \trianglelefteq G$. The set G/N of cosets of N is a group under
the usual multiplication of subsets of G. The natural map from G to G/N
is a homomorphism of G onto G/N with kernel N.


Proof. By Theorem 1.2, multiplication of subsets of G is associative
on all of $2^G$ and therefore is associative on G/N. In the proof of Theorem
1 we showed that for any x and y in G we have (Nx)(Ny) = Nxy. Taking
x to be 1, we see that N1 = N is a left identity element in G/N, and taking
x to be $y^{-1}$, we see that $Ny^{-1}$ is a left inverse for Ny relative to N. By
Theorem 2.5, G/N is a group. The natural map from G to G/N is the map
f that takes an element x of G to the coset Nx. Clearly, f is surjective. The
formula (Nx)(Ny) = Nxy shows that f is a homomorphism. Suppose x
is in the kernel of f. Then N = xf = Nx. This implies x is in N. Every element
of N is in the kernel, and so the kernel of f is N. □
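The construction in Theorem 5 multiplies cosets as subsets. The Python sketch below builds a group table for G/N in that way; it is an illustration under stated assumptions, not the book's APL method. The names `subset_product` and `quotient_table` are hypothetical, and the example group is $Z_6$ with N = {0, 3}. If N is not normal, some product of cosets is not a coset and the table lookup raises `KeyError`.

```python
def subset_product(A, B, op):
    """The product AB = {ab : a in A, b in B} of two subsets."""
    return frozenset(op(a, b) for a in A for b in B)

def quotient_table(G, N, op):
    """Number the cosets of N and return (cosets, table) for G/N.

    table[i][j] is the number of the coset cosets[i] * cosets[j].
    Assumes N is a normal subgroup of the finite group (G, op).
    """
    cosets = sorted({frozenset(op(n, x) for n in N) for x in G}, key=min)
    number = {c: i for i, c in enumerate(cosets)}
    table = [[number[subset_product(a, b, op)] for b in cosets] for a in cosets]
    return cosets, table

add6 = lambda x, y: (x + y) % 6
cosets, table = quotient_table(range(6), [0, 3], add6)
```

For this example the cosets are {0, 3}, {1, 4}, {2, 5} and the resulting table is the addition table of $Z_3$.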


The group G/N is called the quotient group or the factor group of
G by N.

THEOREM 6. The quotient group of Z by nZ is $Z_n$, $n \ge 1$.

Proof. Any subgroup of an abelian group is normal. (See Exercise 1.)
We defined the binary operation + on the set $Z_n$ of congruence classes modulo
n, which is the same as the set of cosets of nZ in Z, such that [x] + [y] =
[x + y]. In the notation of cosets this is

(x + nZ) + (y + nZ) = (x + y) + nZ,

and this is precisely the binary operation in the quotient group Z/nZ. □


THEOREM 7 (First Isomorphism Theorem). Let $f: G \to H$ be a surjective
homomorphism of groups and let N be the kernel of f. Then H is
isomorphic to the quotient group G/N.




Proof. Let $g: G \to G/N$ be the natural map. We would like to define a
map h from G/N to H such that the diagram commutes in the sense that the
composition $g \circ h$ is equal to f.

[Diagram: a triangle with maps $f: G \to H$, $g: G \to G/N$, and $h: G/N \to H$.]

There is at most one such map h since, if x is in G, then

$xf = x(g \circ h) = (xg)h = (Nx)h$,

so h must map Nx to xf. But is the h so described well defined? That is,
is the subset

$h = \{(Nx, xf) \mid x \in G\}$

a function from G/N to H? Certainly h is a relation from G/N to H. Suppose
Nx = Ny. Then, by Theorem 3, xf = yf and the coset Nx is the first
component of exactly one ordered pair in h. Therefore h is a function and
$f = g \circ h$. All that remains is to show that h is an isomorphism. If x and y are
in G, then the product of (Nx)h and (Ny)h is (xf)(yf) = (xy)f = (Nxy)h
and Nxy is (Nx)(Ny). Therefore h is a homomorphism. Since $g \circ h$ is
surjective, we know h is surjective by Theorem 1.3.2. Suppose Nx is in the
kernel of h. Then xf = (Nx)h = 1, so x is in N. Hence the kernel of h is
trivial and, by Corollary 4, h is injective. Thus h is a bijective
homomorphism, that is, an isomorphism. □
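The well-definedness question in this proof can be made concrete. The Python sketch below builds the relation h = {(Nx, xf)} as a dictionary and rejects it if some coset would receive two different values. The example map $x \mapsto 3x$ on $Z_{12}$ and the names `induced_map` and `coset` are assumptions chosen for illustration.

```python
def induced_map(f, G, N, op):
    """Build h = {(Nx, xf)} as a dict; raise ValueError if it is not a function."""
    h = {}
    for x in G:
        coset_of_x = frozenset(op(k, x) for k in N)
        # setdefault stores xf the first time the coset is seen and
        # returns the stored value afterwards.
        if h.setdefault(coset_of_x, f(x)) != f(x):
            raise ValueError("h is not well defined")
    return h

n = 12
G = list(range(n))
add = lambda x, y: (x + y) % n
f = lambda x: (3 * x) % n                 # homomorphism of Z_12 onto {0, 3, 6, 9}
N = [x for x in G if f(x) == 0]           # kernel of f: [0, 4, 8]
coset = lambda x: frozenset(add(k, x) for k in N)
h = induced_map(f, G, N, add)
```

With N equal to the kernel, h is defined on the four cosets and takes four distinct values. Replacing N by a subgroup that is not contained in the kernel, such as {0, 6}, makes `induced_map` raise `ValueError`, which is exactly the failure the proof rules out.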


Theorem 7 allows us to describe all cyclic groups up to isomorphism. 


THEOREM 8. Any infinite cyclic group is isomorphic to Z. Any finite
cyclic group is isomorphic to $Z_n$ for a unique n > 0.


Proof. Let $G = \langle x \rangle$ be a cyclic group. The map f taking m in Z to
$x^m$ is a homomorphism from Z onto G, and so by Theorem 7, G is
isomorphic to the quotient group of Z by the kernel of f. The kernel of f is
nZ for a unique nonnegative integer n. If n > 0, then nZ has n cosets.
Therefore, if G is infinite, then n = 0 and $G \cong Z/\{0\}$. We leave as an exercise the
verification that $Z/\{0\}$ is isomorphic to Z. (See Exercise 2.) If G is finite, then
n > 0 and $G \cong Z/nZ = Z_n$ by Theorem 6. □


The next theorem may be viewed as an addition to Theorem 4.1. 


THEOREM 9. Let G and H be groups and let $f: G \to H$ be a surjective
homomorphism with kernel N. Then there is a bijection from the set of
subgroups of G containing N onto the set of subgroups of H. If a subgroup
K of G containing N is mapped to a subgroup L of H, then $K \trianglelefteq G$ if and
only if $L \trianglelefteq H$.


Proof. Let S be the set of subgroups of G containing N and let T be the
set of subgroups of H. Given K in S, the image Kf of K under f is in T. Also,
given L in T, the inverse image $Lf^{-1}$ is a subgroup of G containing $\{1\}f^{-1} = N$,
and so $Lf^{-1}$ is in S. We will show that the maps $K \mapsto Kf$ and $L \mapsto Lf^{-1}$
are inverses of each other. If K is in S, then $(Kf)f^{-1} \supseteq K$. But if x is in
$(Kf)f^{-1}$, then xf is in Kf and xf = yf for some y in K. By Theorem 3, x is
in Ny, which is contained in K, and so x is in K. Therefore $(Kf)f^{-1} = K$.
Now suppose L is in T. By the definition of $Lf^{-1}$, $(Lf^{-1})f \subseteq L$. Given u in
L, there exists x in G with xf = u, since f is surjective. Thus x is in $Lf^{-1}$ and
so $(Lf^{-1})f = L$. By Corollary 1.3.4, the maps $K \mapsto Kf$ and $L \mapsto Lf^{-1}$ are
bijections and are inverses of each other. The final part of the theorem
follows from Theorem 3. □


The bijection in Theorem 9 preserves indices in the following sense. 


COROLLARY 10. In Theorem 9, if K is a subgroup of G containing
N, then there is a 1-1 correspondence between the right cosets of K in
G and the right cosets of Kf in H. In particular, if G is finite, then |G:K| =
|H:Kf|.

Proof. For any x in G we have (Kx)f = (Kf)(xf) and, therefore, the
image under f of the coset Kx of K is a coset of Kf. Thus $Kx \mapsto (Kf)(xf)$
defines a map g from the set of right cosets of K in G to the set of right
cosets of Kf in H. Clearly, g is surjective. Suppose (Kf)(xf) = (Kf)(yf).
Then $(xf)(yf)^{-1} = (xy^{-1})f$ is in Kf and, since we showed in the proof of
Theorem 9 that $(Kf)f^{-1} = K$, this means that $xy^{-1} \in K$ or Kx = Ky.
Therefore g is injective too. □


We have already determined all of the subgroups of Z. Using Theorem
9, we can find the subgroups of $Z_n$ for n > 0.


COROLLARY 11. Let G be a finite cyclic group of order n. Then 
every subgroup of G is cyclic and G contains exactly one subgroup of order 
m for each positive integer dividing n. 


Proof. By Theorem 8 we may assume $G = Z_n$. Let f be the natural
map of Z onto $Z_n$. By Theorem 9 there is a 1-1 correspondence between
the subgroups of $Z_n$ and the subgroups of Z containing the kernel nZ of
f. Any subgroup of Z has the form dZ for some $d \ge 0$ and $nZ \subseteq dZ$ if and
only if d divides n. Suppose d does divide n. Then the image of dZ in $Z_n$ is
the cyclic subgroup generated by df = [d]. Let m = n/d. Then m[d] =
[md] = [n] = [0]. If 0 < r < m, then 0 < rd < n and $r[d] \ne [0]$.
Therefore [d] has order m in $Z_n$. For each divisor m of n the value d = n/m is
uniquely determined. Thus $Z_n$ has exactly one subgroup of order m. □
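Corollary 11 can be confirmed for a small n by exhaustive search. The Python sketch below is illustrative only, with hypothetical names `all_subgroups` and `cyclic_subgroup`. It uses the fact that a nonempty subset of a finite group that is closed under the operation is a subgroup, so closure is the only condition tested.

```python
from itertools import combinations

def cyclic_subgroup(d, n):
    """The subgroup of Z_n generated by [d]."""
    return frozenset((d * k) % n for k in range(n))

def all_subgroups(n):
    """Every subgroup of Z_n, found by testing each subset containing 0 for closure."""
    found = []
    for r in range(n):
        for rest in combinations(range(1, n), r):
            S = frozenset((0,) + rest)
            if all((a + b) % n in S for a in S for b in S):
                found.append(S)
    return found

subgroups = all_subgroups(12)
orders = sorted(len(S) for S in subgroups)
```

For n = 12 the search finds one subgroup for each divisor of 12, and each subgroup of order m equals the cyclic subgroup generated by [12/m], in agreement with the proof.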




Theorem 7 is called the First Isomorphism Theorem. There are two 
additional isomorphism theorems, which we state for completeness. 

THEOREM 12 (Second Isomorphism Theorem). Let N be a normal
subgroup of a group G and let K be a subgroup of G. Then

(a) NK is a subgroup of G and N is a normal subgroup of NK.

(b) $N \cap K$ is a normal subgroup of K.

(c) The quotient groups (NK)/N and $K/(N \cap K)$ are isomorphic.

Proof. See Exercise 15. □


THEOREM 13 (Third Isomorphism Theorem). Let H and K be normal
subgroups of a group G with $H \subseteq K$. Then

$G/K \cong (G/H)/(K/H)$.

Proof. See Exercise 16. □


EXERCISES 


1 Prove that every subgroup of an abelian group is normal. 


2 Show that for any group G the subgroups {1} and G are normal 
in G and G/ {1} is isomorphic to G. 


3 Prove that the intersection of any nonempty collection of normal 
subgroups of a group G is normal in G. 


4 Let H be a subgroup of the group G and let $N = \{x \in G \mid Hx = xH\}$.
Show that N is a subgroup of G containing H and $H \trianglelefteq N$. We call
N the normalizer of H in G.


5 Suppose that H is a subgroup of G, N is the normalizer of H in G,
and C is the centralizer of H in G. (See Exercise 3.8.) Show that
C is a subgroup of N and $C \trianglelefteq N$.


6 Let $f: G \to H$ be a homomorphism of groups and let $x \in G$. Show
that if x has finite order, then the order of xf divides the order
of x.


7 Give an example of a group G, a normal subgroup K of G, and a
homomorphism $f: G \to H$ such that Kf is not normal in H.

8 Let n be a positive integer. Show that the order of the congruence
class [m] in $Z_n$ is n/gcd(m, n).

9 Let x and y be commuting elements of a group G and suppose that
x, y, and xy have orders m, n, and r, respectively. Set d = gcd(m, n).
Show that r divides mn and that $mn/d^2$ divides r. Thus if d = 1,
then r = mn.


10 Let x be an element of a group such that $x^{10}$ has order 6. What is
the order of x? Is the answer unique?


*11 Let G be an N-by-N group table and let H be a vector listing the
elements of a subgroup of (ιN, G). Write an APL proposition
corresponding to each of the conditions (a) to (e) of Theorem 1. You
may assume that the vector INV gives the inverses of the group
elements.


12 With G and H as in Exercise 11, write an APL expression defining
a vector that lists the elements of the normalizer of H in (ιN, G).


13 Construct several subgroups of (ι24, G24) and (ι60, G60) as
follows. Begin with a cyclic subgroup and construct the centralizer
and normalizer of this subgroup. Then compute the centralizers
and normalizers of these groups, continuing until no new groups
are found. Repeat the process with another cyclic subgroup. (See
Exercises 12 and 3.9.)


14 Let K be a vector listing the elements of a subgroup of the current
abstract group G and let R←GTRCON K. In Exercise 3.35 we saw
that REP←SSORT SFEL R is a vector listing right coset
representatives for K in G. Show how to construct a vector COSET such that
COSET[I] is the number of the coset of K containing I, that
is, I is right congruent to REP[COSET[I]] modulo K. Assuming
K is normal in G, write an APL expression defining a group table
for the quotient group of G by K.


15 Let N, G, and K be as in Theorem 12 and let $f: G \to G/N$ be the
natural map. Show that $NK = (Kf)f^{-1}$, and so NK is a subgroup
of G. Show that the kernels of the restrictions of f to NK and
K are N and $N \cap K$, respectively. Complete the proof of Theorem
12 by using Theorem 7 and the observation that (NK)f = Kf.


16 Let H, K, and G be as in Theorem 13 and let $f: G \to G/H$ be the
natural map. Prove that Kf = K/H and $K/H \trianglelefteq G/H$. Let
$g: G/H \to (G/H)/(K/H)$ be the natural map. Prove that $f \circ g$ maps G onto
(G/H)/(K/H) with kernel K. Complete the proof of Theorem 13.


6. DIRECT PRODUCTS 


Given a group G, we can construct new groups by forming subgroups and 
quotient groups of G. In this section we will study a method for obtaining 
a new group from two given groups. 

Let $(G, \circ)$ and $(H, *)$ be groups. We can define a binary operation
$\otimes$ on the Cartesian product $G \times H$ by setting

$(x_1, y_1) \otimes (x_2, y_2) = (x_1 \circ x_2, y_1 * y_2)$

for all $x_1, x_2$ in G and all $y_1, y_2$ in H. Since both

$(x_1, y_1) \otimes [(x_2, y_2) \otimes (x_3, y_3)]$

and

$[(x_1, y_1) \otimes (x_2, y_2)] \otimes (x_3, y_3)$

are equal to $(x_1 \circ x_2 \circ x_3, y_1 * y_2 * y_3)$, the operation $\otimes$ is associative.
The pair (1, 1) is a two-sided identity element and $(x^{-1}, y^{-1})$ is a two-sided
inverse for (x, y). Thus $(G \times H, \otimes)$ is a group, which we call the external
direct product of $(G, \circ)$ and $(H, *)$. If we are using multiplicative notation
for G and H, we use multiplicative notation for $G \times H$ and write

$(x_1, y_1)(x_2, y_2) = (x_1x_2, y_1y_2)$;

if we are using additive notation, we write $(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)$.
When additive notation is being used, the group $G \times H$ is sometimes
written $G \oplus H$ and called the external direct sum of G and H.
However, in certain contexts the terms "direct sum" and "direct product" have
different meanings. (See Exercise 12.)

If we have three groups R, S, and T, we can form the direct products
$R \times (S \times T)$ and $(R \times S) \times T$. The reader should verify that our
identification of these two objects as sets is a group isomorphism. We will generally
not distinguish between these two groups and write $R \times S \times T$. The
following is another simple but important fact about direct products. If
$G_1 \cong G_2$ and $H_1 \cong H_2$, then $G_1 \times H_1 \cong G_2 \times H_2$. (See Exercise 2.)

The group $G = Z \times Z$ is easy to work with on an APL system. The
elements of G are simply integer vectors of length 2. If X and Y are in
G, then the sum of X and Y is X+Y and the negative of X is -X.


      X←7 ¯3
      Y←¯1 5
      X+Y
6 2
      ⎕←Z←-X
¯7 3
      X+Z
0 0


We can represent the elements of $Z_7 \times Z_7$ by vectors of length 2 with
components in ι7. The sum of X and Y is now 7|X+Y, while the negative of
X is 7|-X.


      X←5 2
      Y←3 4
      7|X+Y
1 6
      ⎕←Z←7|-X
2 5
      7|X+Z
0 0


For $Z_2 \times Z_3$ we use vectors whose first components are in ι2 and whose
second components are in ι3. The sum of X and Y is N|X+Y where N←2 3.


      N←2 3
      X←1 2
      N|X+X
0 1


Let us find the order of 1 1 in $Z_2 \times Z_3$.


      A←1 1
      ⎕←B←N|A+A
0 2
      ⎕←C←N|A+B
1 0
      ⎕←D←N|A+C
0 1
      ⎕←E←N|A+D
1 2
      ⎕←F←N|A+E
0 0


Here we see that the cyclic group generated by A contains all the elements
of $Z_2 \times Z_3$. By Theorem 5.8, we have $Z_2 \times Z_3 \cong Z_6$. Corollary 2, which
follows, generalizes this observation.
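The same computation can be sketched outside APL. The Python function below is an illustration, not part of the book's workspace; the name `order` and the tuple encoding of elements are assumptions. It adds an element of $Z_{n_1} \times \cdots \times Z_{n_k}$ to itself until the zero vector appears, just as the session above does by hand.

```python
def order(a, moduli):
    """Order of the element a in the direct product of the groups Z_m, m in moduli."""
    zero = tuple(0 for _ in moduli)
    start = tuple(ai % m for ai, m in zip(a, moduli))
    x, k = start, 1
    while x != zero:
        # Componentwise addition followed by reduction, like N|A+X in APL.
        x = tuple((xi + si) % m for xi, si, m in zip(x, start, moduli))
        k += 1
    return k
```

With moduli (2, 3) the element (1, 1) has order 6, so it generates the whole group. With moduli (2, 4) the element (1, 1) has order only 4, which is consistent with the restriction to relatively prime m and n in Corollary 2.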

Let G and H be groups. The projection map of $G \times H$ onto H is the
map taking (x, y) in $G \times H$ to y. It is easy to check that this map is a group
homomorphism with kernel $K = \{(x, 1) \mid x \in G\}$. Therefore K is a normal
subgroup of $G \times H$ and $(G \times H)/K$ is isomorphic to H. Similarly, the map
$(x, y) \mapsto x$ is a homomorphism of $G \times H$ onto G with kernel
$L = \{(1, y) \mid y \in H\}$. The maps $x \mapsto (x, 1)$ and $y \mapsto (1, y)$ are isomorphisms of G
and H onto K and L, respectively. Since

$(x, 1)(1, y) = (x, y) = (1, y)(x, 1)$,

elements of K commute with elements of L, and every element of $G \times H$
can be expressed uniquely as a product of an element in K and an element
in L.

Let G be a group and let K and L be subgroups of G. It is impossible
for G to be equal to the external direct product $K \times L$, since K and L
are not subsets of $K \times L$. However, it is possible for G to be isomorphic
to $K \times L$ in a particularly nice way. We say that G is the internal direct
product of K and L if the map taking (x, y) in $K \times L$ to xy in G is an
isomorphism of $K \times L$ onto G.

THEOREM 1. A group G is the internal direct product of its
subgroups K and L if and only if each of the following holds.

(a) $K \trianglelefteq G$ and $L \trianglelefteq G$.

(b) $K \cap L = \{1\}$.

(c) G = KL.

Proof. In $K \times L$ we have the subgroups $K_1 = \{(x, 1) \mid x \in K\}$ and
$L_1 = \{(1, y) \mid y \in L\}$. By the previous discussion, both $K_1$ and $L_1$ are normal in
$K \times L$, $K_1 \cap L_1$ is trivial, and $K \times L = K_1L_1$. Let $f: K \times L \to G$ map (x, y)
to xy. Then $K_1f = K$ and $L_1f = L$. If f is an isomorphism, then conditions
(a), (b), and (c) must hold. Now suppose (a), (b), and (c) hold. Let x be
in K and y in L. Our first task is to show that x and y commute. Consider
$u = x^{-1}y^{-1}xy$. Since x is in K and $K \trianglelefteq G$, both $x^{-1}$ and $y^{-1}xy$ are in K,
and so u is in K. But y is in L and $L \trianglelefteq G$ and hence $x^{-1}y^{-1}x$ is in L.
Therefore u is in L too. Thus u is in $K \cap L = \{1\}$ and u = 1. This implies that

$yx = yxu = yxx^{-1}y^{-1}xy = xy$,

and so x and y commute. For all $x_1, x_2$ in K and all $y_1, y_2$ in L we have

$[(x_1, y_1)(x_2, y_2)]f = (x_1x_2, y_1y_2)f = x_1x_2y_1y_2 = x_1y_1x_2y_2 = ((x_1, y_1)f)((x_2, y_2)f)$.

Therefore f is a homomorphism. By (c) we know that f is surjective.
Suppose (x, y) is in the kernel of f. Then xy = 1 or $x = y^{-1}$. But x is in K
and $y^{-1}$ is in L, so x and y are in $K \cap L$. Thus x = y = 1 and the kernel
of f is trivial. Therefore f is an isomorphism. □
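The three conditions of Theorem 1 are finite checks in a finite group. The Python sketch below tests them directly; the name `is_internal_direct_product` and its arguments are hypothetical, and the function assumes that K and L are already known to be subgroups of G.

```python
def is_internal_direct_product(G, K, L, op, inv):
    """Test conditions (a), (b), (c) of Theorem 1 for subgroups K and L of G."""
    G, K, L = set(G), set(K), set(L)
    # (a) both K and L are normal: g^-1 h g stays in the subgroup.
    normal = all(op(op(inv(g), h), g) in S for S in (K, L) for g in G for h in S)
    # (b) the intersection contains only the identity.
    trivial_meet = len(K & L) == 1
    # (c) every element of G is a product kl.
    spans = {op(k, l) for k in K for l in L} == G
    return normal and trivial_meet and spans

add6 = lambda x, y: (x + y) % 6
neg6 = lambda x: (-x) % 6
add4 = lambda x, y: (x + y) % 4
neg4 = lambda x: (-x) % 4
```

In $Z_6$ the subgroups {0, 3} and {0, 2, 4} satisfy all three conditions. In $Z_4$ the only subgroup of order 2 is {0, 2}, so no pair of proper subgroups can satisfy (b) and (c) together.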


If G is the internal direct product of its subgroups K and L, it is
customary to write $G = K \times L$. It should always be clear from the context
whether external or internal direct products are meant.


COROLLARY 2. Suppose m and n are relatively prime integers. Then
$Z_{mn} \cong Z_m \times Z_n$.


First Proof. In $Z_{mn}$ the cyclic subgroup generated by [n] is
$K = \{[nx] \mid x \in Z\}$. Now [nx] = [ny] if and only if $x \equiv y \pmod{m}$, and so K has
order m. Thus $K \cong Z_m$. Similarly, $L = \{[mx] \mid x \in Z\}$ is a subgroup of $Z_{mn}$
isomorphic to $Z_n$. Since $Z_{mn}$ is abelian, both K and L are normal in $Z_{mn}$. If
[u] is in $K \cap L$, then m and n divide u. Since gcd(m, n) = 1, this means that
mn divides u, so [u] = [0]. Therefore $K \cap L$ is trivial. Finally, given v in Z,
there are integers x and y such that nx + my = v. Thus [nx] + [my] = [v]
and $K + L = Z_{mn}$. By Theorem 1, $Z_{mn}$ is isomorphic to $K \times L$, which is
isomorphic to $Z_m \times Z_n$.


Second Proof. If $x \equiv y \pmod{mn}$, then $x \equiv y \pmod{m}$ and
$x \equiv y \pmod{n}$. Therefore we may define a function g from $Z_{mn}$ to $Z_m \times Z_n$
by $[x]g = ([x]_m, [x]_n)$, where $[x]_m$ and $[x]_n$ denote the residue classes
modulo m and n containing x, respectively. It is easily checked that g is a
homomorphism. By the Chinese Remainder Theorem, g is surjective. If
$[x]_m = [0]_m$ and $[x]_n = [0]_n$, then [x] = [0] in $Z_{mn}$, so g is injective.
Therefore g is an isomorphism. □
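The map g of the second proof is easy to tabulate. The Python sketch below is an illustration with hypothetical names `g_map` and `is_isomorphism_onto_product`. It checks that $[x] \mapsto ([x]_m, [x]_n)$ preserves addition and then counts its distinct values to decide whether it is a bijection.

```python
from math import gcd

def g_map(x, m, n):
    """The pair of residues ([x]_m, [x]_n) of a representative x."""
    return (x % m, x % n)

def is_isomorphism_onto_product(m, n):
    """True if [x] -> ([x]_m, [x]_n) is a bijective homomorphism Z_mn -> Z_m x Z_n."""
    mn = m * n
    images = {g_map(x, m, n) for x in range(mn)}
    bijective = len(images) == mn   # mn distinct values among mn possible pairs
    additive = all(
        g_map((x + y) % mn, m, n)
        == ((x % m + y % m) % m, (x % n + y % n) % n)
        for x in range(mn) for y in range(mn)
    )
    return bijective and additive
```

The homomorphism condition always holds. Bijectivity holds for small m and n exactly when gcd(m, n) = 1, for instance for (4, 9) but not for (4, 6).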




A slight modification of the second proof of Corollary 2 yields the next 
result. 


THEOREM 3. Let m and n be relatively prime positive integers. Then
$U_{mn} \cong U_m \times U_n$.

Proof. For any x in Z it is easy to see that gcd(x, mn) = 1 if and only
if gcd(x, m) = 1 and gcd(x, n) = 1. Thus [x] is in $U_{mn}$ if and only if $[x]_m$
is in $U_m$ and $[x]_n$ is in $U_n$. Therefore g in the second proof of Corollary
2 induces a bijection of $U_{mn}$ onto $U_m \times U_n$. Not only does g preserve
addition, it also preserves multiplication. Therefore the restriction of g to
$U_{mn}$ is an isomorphism of $U_{mn}$ onto $U_m \times U_n$. □


Recall that $\varphi(n)$ is defined to be $|U_n|$.


COROLLARY 4. If gcd(m, n) = 1, then $\varphi(mn) = \varphi(m)\varphi(n)$. If $p_1, \ldots, p_r$
are distinct primes and $n = p_1^{e_1} \cdots p_r^{e_r}$, $e_i \ge 1$, then

$\varphi(n) = \prod_{i=1}^{r} p_i^{e_i - 1}(p_i - 1)$.


Proof. By Theorem 3, if gcd(m, n) = 1, then $|U_{mn}| = |U_m \times U_n| = |U_m| \times |U_n|$
and so $\varphi(mn) = \varphi(m)\varphi(n)$. Therefore, if $p_1, \ldots, p_r$ are
distinct primes and $n = p_1^{e_1} \cdots p_r^{e_r}$, then

$\varphi(n) = \prod_{i=1}^{r} \varphi(p_i^{e_i})$.

It remains to determine $\varphi(p^e)$ for p a prime and $e \ge 1$. Now $\gcd(x, p^e) = 1$
if and only if gcd(x, p) = 1, and so the only elements of $Z_{p^e}$ not in $U_{p^e}$ are
the classes [py], $y \in Z$. There are $p^{e-1}$ such classes, so
$\varphi(p^e) = p^e - p^{e-1} = p^{e-1}(p - 1)$. □
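The product formula of Corollary 4 can be compared with the definition of $\varphi(n)$ as $|U_n|$. The Python sketch below is one possible illustration and is not the APL procedure asked for in the exercises; the names `phi_by_counting` and `phi_by_formula` are assumptions, and the factorization uses plain trial division.

```python
from math import gcd

def phi_by_counting(n):
    """|U_n|: the number of classes [x], 0 <= x < n, with gcd(x, n) = 1."""
    return sum(1 for x in range(n) if gcd(x, n) == 1)

def phi_by_formula(n):
    """The product of p^(e-1) * (p - 1) over the prime powers p^e exactly dividing n."""
    result, m, p = 1, n, 2
    while p * p <= m:
        if m % p == 0:
            e = 0
            while m % p == 0:
                m //= p
                e += 1
            result *= p ** (e - 1) * (p - 1)
        p += 1
    if m > 1:            # one prime factor with exponent 1 is left over
        result *= m - 1
    return result
```

The two functions agree for every n tried in the accompanying checks, for example both give 24 for n = 72.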


Let n be a positive integer. The direct product $Z \times Z \times \cdots \times Z$ with
n factors is usually denoted $Z^n$ and is a very important group. The
elements of $Z^n$ are integer vectors of length n with componentwise addition
as the binary operation. In Chapter 6 we will study the subgroups and
quotient groups of $Z^n$ extensively. As a result of this investigation, we will be
able to give a complete description of all finitely generated abelian groups.
It will turn out that any such group is isomorphic to exactly one group of
the form

$Z_{d_1} \times Z_{d_2} \times \cdots \times Z_{d_r} \times Z^m$,

where r and m are nonnegative integers, each $d_i$ is an integer greater than
1, and $d_i$ divides $d_{i+1}$ for $1 \le i < r$. This result, referred to as the
Fundamental Theorem of Finitely Generated Abelian Groups, is Theorem 6.6.7.




EXERCISES 

1 Let G and H be groups. Show that $G \times H \cong H \times G$.

2 Suppose $G_1$, $G_2$, $H_1$, and $H_2$ are groups with $G_1 \cong G_2$ and
$H_1 \cong H_2$. Prove that $G_1 \times H_1 \cong G_2 \times H_2$.

3 Suppose that G and H are groups and that U is a subgroup of G
and V is a subgroup of H. Show that $U \times V$ is a subgroup of $G \times H$.

4 Give an example of a group G with subgroups K and L such that
$G \cong K \times L$ but G is not the internal direct product of K and L.

5 Let $A = \langle a_1, \ldots, a_n \rangle$ be a finitely generated abelian group
written additively. Show that the map taking $(m_1, \ldots, m_n)$ in
$Z^n$ to $m_1a_1 + \cdots + m_na_n$ is a homomorphism of $Z^n$ onto A.
Conclude that any finitely generated abelian group is isomorphic
to a quotient group of $Z^n$ for some n.

6 Show that any group of order 4 is isomorphic to $Z_2 \times Z_2$ or
to $Z_4$.

7 Show that any abelian group of order 6 is cyclic.

8 Let m and n be positive integers and set d = gcd(m, n) and r =
lcm(m, n). Generalize Corollary 2 by proving that $Z_m \times Z_n$ is
isomorphic to $Z_d \times Z_r$.

9 Let $m_1, \ldots, m_r$ be positive integers. Use Exercise 8 to show
that there exist positive integers $d_1, \ldots, d_r$ such that

$Z_{m_1} \times \cdots \times Z_{m_r} \cong Z_{d_1} \times \cdots \times Z_{d_r}$

and $d_i$ divides $d_{i+1}$, $1 \le i < r$.
10 Which of the following abelian groups of order 72 are isomorphic? 

(a) Zp. (d) ZX Z4 X Lo. 
(b) Z,X Z3X Z,X Zq. (ec) Ze X Zp. 
(c) Zg X Zo. (f) ZX ZL36. 

11 Show Ug =Z, X Z, and Uy, = Zy X Ly. 

12 For each i in the nonempty index set I, let $G_i$ be a group. The
Cartesian product

$H = \prod_{i \in I} G_i$

is defined to be the set of functions f from I to the union of the $G_i$
such that for each i the value of f(i) is in $G_i$. We can define a binary
operation $*$ on H by $(f * g)(i) = f(i)g(i)$, where the product is
computed in $G_i$. Show that $(H, *)$ is a group. This group is called
the direct product of the groups $G_i$. Let K be the set of elements f
of H such that f(i) is the identity element of $G_i$ for all but a finite
number of elements i in I. Show that K is a subgroup of H. The
group K is called the direct sum of the $G_i$. If I is finite, then H = K
but, if I is infinite, then K is a proper subgroup of H.


13 The goal of this exercise is to construct a group table for G =
$Z_3 \times Z_3$. The nine elements of G are given by the columns of the
matrix V:

0 0 0 1 1 1 2 2 2
0 1 2 0 1 2 0 1 2

If X←V[;I], we can compute I from X by I←3⊥X.


      ⎕←X←V[;5]
1 2
      3⊥X
5


Write an APL expression for a rank 3 array SUM such that the
vector SUM[;I;J] is the sum of V[;I] and V[;J] in G, that is,
3|V[;I]+V[;J]. (Hint. First form 3|V∘.+V and then take a
suitable generalized transpose.) Our group table for G is the matrix
T such that T[I;J] is 3⊥SUM[;I;J], the number of the
column of V that is equal to SUM[;I;J]. Construct T.


14 Construct a group table for Z, X Ze.

15 Let S and T be an M-by-M and an N-by-N group table, respectively.
Describe a method for constructing a group table for the direct
product of the groups (ιM, S) and (ιN, T). As an example, take
S and T to be G6.

16 Compute $\varphi$(10° ).

17 Write a procedure PHI to compute $\varphi(N)$.


18 Let $H_1, \ldots, H_r$ be subgroups of a group G. We say that G is the
internal direct product of the $H_i$ if the map of $H_1 \times \cdots \times H_r$
to G taking $(h_1, \ldots, h_r)$ to $h_1 \cdots h_r$ is an isomorphism. Give
a condition in the spirit of Corollary 2 that is equivalent to the
assertion that G is the internal direct product of the $H_i$.

19 Let G be a group and let D be the diagonal of $G \times G$. Show that
D is a subgroup of $G \times G$. Show also that there is a 1-1
correspondence between normal subgroups of G and subgroups of $G \times G$
that contain D.


7. PERMUTATIONS 


In this section we investigate individual elements of the symmetric group
Σ(X) on a set X. In the following section we will study subgroups of Σ(X).
Let x_1, ..., x_r be a sequence of distinct elements of X. Then there
is a unique permutation f of X such that

1. f fixes every element of X − {x_1, ..., x_r}.
2. x_i f = x_{i+1}, 1 ≤ i < r.
3. x_r f = x_1.
We denote f by (x_1, ..., x_r) and refer to f as an r-cycle or a cycle of
length r. We also say that f cyclically permutes the x_i. The cycle
notation (x_1, ..., x_r) is not uniquely determined by f. For example, if
x is an element of X, then the 1-cycle (x) is the identity permutation.
If f is not the identity, then the set {x_1, ..., x_r} is uniquely
determined, but there are r choices for x_1. Thus

f = (x_1, ..., x_r) = (x_2, ..., x_r, x_1) = ... = (x_r, x_1, ..., x_{r-1}).

For example, in the symmetric group Σ_3, there are just two distinct
3-cycles: (0, 1, 2) and (0, 2, 1). We say the cyclic permutations
(x_1, ..., x_r) and (y_1, ..., y_s) are disjoint if one of them is the
identity permutation or if r > 1, s > 1, and

{x_1, ..., x_r} ∩ {y_1, ..., y_s} = ∅.

THEOREM 1. Disjoint cycles commute.

Proof. Let f = (x_1, ..., x_r) and g = (y_1, ..., y_s) be disjoint cycles.
If either f or g is the identity permutation, then clearly fg = gf. Thus we
may assume r > 1, s > 1, and {x_1, ..., x_r} ∩ {y_1, ..., y_s} = ∅. If x is
any element of X not in {x_1, ..., x_r} ∪ {y_1, ..., y_s}, then f and g both
fix x and x(fg) = x = x(gf). Also, for 1 ≤ i ≤ r, the points x_i and x_i f
are fixed by g, and

x_i(fg) = (x_i f)g = x_i f = (x_i g)f = x_i(gf).

Similarly, y_j(fg) = y_j(gf) for 1 ≤ j ≤ s. Therefore fg and gf agree on
every element of X, so fg = gf. □


Now let us assume that X is finite and that f is any element of Σ(X).
Choose x in X and set x_0 = x, x_1 = x_0 f, x_2 = x_1 f, and so on. Clearly,
x_i = x f^i. Since X is finite, the elements x_0, x_1, ... cannot all be
distinct. Let n be the smallest integer such that x_n = x_m for some
m < n. If m > 0, then

x_{m-1} f = x_m = x_n = x_{n-1} f.

But f is a permutation of X, and so in particular f is 1-1. This means that
x_{m-1} = x_{n-1}, contradicting our choice of n. Thus m = 0. The elements
x_0, ..., x_{n-1} are distinct, and f agrees with the n-cycle
f_x = (x_0, ..., x_{n-1}) on the set {x_0, ..., x_{n-1}}. We call f_x a
cycle of f. Let C_f denote {f_x | x ∈ X}.

THEOREM 2. Let X be a finite set and let f be in Σ(X). Suppose x and
y are in X. Then either f_x = f_y or f_x and f_y are disjoint.

Proof. Let f_x = (x_1, ..., x_r) and f_y = (y_1, ..., y_s). If f_x and f_y
are not disjoint, then neither f_x nor f_y is the identity and x_i = y_j for
some i and j. But, by our preceding remarks concerning the uniqueness of
the cycle notation, we may assume i = j = 1. Then, since x_{k+1} = x_k f,
1 ≤ k < r, and y_{k+1} = y_k f, 1 ≤ k < s, it follows immediately that
r = s and x_i = y_i, 1 ≤ i ≤ r. □


Let f be the permutation of ⍳7 given by the vector

4 5 6 3 1 0 2.

Then C_f has three elements.

f_0 = f_1 = f_4 = f_5 = (0,4,1,5),
f_2 = f_6 = (2,6),
f_3 = (3),

and f is the product (0,4,1,5)(2,6)(3) of the elements of C_f. As the next
theorem shows, this is always true.
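The construction of the cycles f_x can be sketched in a few lines of Python (used here in place of APL). The function name `cycles` is an assumption made for illustration and is not one of the procedures supplied with the text. A permutation is stored as a vector p in origin 0, with p[i] the image of i.

```python
def cycles(p):
    """Return the disjoint cycles of p, where p[i] is the image of i."""
    seen = [False] * len(p)
    result = []
    for start in range(len(p)):
        if seen[start]:
            continue
        # Follow start, start f, start f^2, ... until the cycle closes.
        cycle = []
        x = start
        while not seen[x]:
            seen[x] = True
            cycle.append(x)
            x = p[x]
        result.append(tuple(cycle))
    return result


f = (4, 5, 6, 3, 1, 0, 2)
print(cycles(f))  # [(0, 4, 1, 5), (2, 6), (3,)]
```

Each point is visited once, so the work is proportional to the number of points.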


THEOREM 3. If X is finite, then any element f of Σ(X) is the product
of the elements of C_f.

Proof. Since X is finite, the set C_f is finite. By Theorems 1 and 2, any
two elements of C_f commute, so we do not need to specify the order in
which the elements of C_f are multiplied together to form the product g.
Let x be in X. We must show that xf = xg. If xf = x, then x is fixed by every
element of C_f and xg = x. Suppose xf ≠ x. Then f_x is the only element of
C_f that moves x, and every other element of C_f fixes x f_x. Therefore
xg = x f_x. However, by the definition of f_x, x f_x = xf. Thus f = g. □


According to Theorem 3, any permutation of X can be expressed as a
product of disjoint cycles. This decomposition is essentially unique, the
only lack of uniqueness coming from the fact that the order of the factors
may be changed, 1-cycles may be omitted and, when r > 1, a given r-cycle
can be written in r different ways. Suppose we have a product of cyclic
permutations that are not disjoint, such as

f = (0,2,4,1,3,5)(1,7,2,8)(0,9,7,6,1)

in Σ_10. We can easily obtain its decomposition into disjoint cycles. In the
example, we first compute the cycle f_0. The first cycle takes 0 to 2, the
second cycle takes 2 to 8, and the third cycle leaves 8 fixed. Thus 0f = 8.
We find that 8f = 0 and so f_0 = (0,8). The first point not moved by f_0 is
1, so we compute f_1. It turns out that f_1 = (1,3,5,9,7,2,4,6); this
accounts for all the elements of ⍳10. Thus

f = f_0 f_1 = (0,8)(1,3,5,9,7,2,4,6).
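The hand computation above can be checked mechanically. The following Python sketch multiplies the three cycles from left to right, using the convention x(fg) = (xf)g of the text, and then reads off the disjoint cycles. The helper names `cycle_to_perm`, `compose`, and `cycles` are hypothetical.

```python
def cycle_to_perm(cycle, n):
    """Vector form of a single cycle acting on range(n)."""
    p = list(range(n))
    for i, x in enumerate(cycle):
        p[x] = cycle[(i + 1) % len(cycle)]
    return tuple(p)


def compose(f, g):
    """Product fg under the right action: apply f first, then g."""
    return tuple(g[f[x]] for x in range(len(f)))


def cycles(p):
    """Disjoint cycles of p, including cycles of length 1."""
    seen, result = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = p[x]
        result.append(tuple(cycle))
    return result


factors = [(0, 2, 4, 1, 3, 5), (1, 7, 2, 8), (0, 9, 7, 6, 1)]
f = tuple(range(10))            # start from the identity
for c in factors:
    f = compose(f, cycle_to_perm(c, 10))

print(cycles(f))  # [(0, 8), (1, 3, 5, 9, 7, 2, 4, 6)]
```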


The inverse of a product of cycles can be found using the rule
(f_1 ... f_s)^{-1} = f_s^{-1} ... f_1^{-1} and the observation that

(x_1, ..., x_r)^{-1} = (x_r, x_{r-1}, ..., x_1) = (x_1, x_r, ..., x_2).

An r-cycle has order r. If f_1, ..., f_s are disjoint cycles and
f = f_1 ... f_s, then f^m = f_1^m ... f_s^m and f^m = 1 if and only if
f_i^m = 1 for each i. Therefore the order of f is the least common
multiple of the lengths of the f_i. For example, (0,2)(1,3,4) has order 6
while (0,2)(1,3,4,5) has order 4.
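As a small illustration of the two rules just stated, the Python sketch below computes the inverse of a permutation vector and its order as the least common multiple of its cycle lengths. The function names are assumptions for this example only; vectors are in origin 0.

```python
from math import gcd


def cycle_lengths(p):
    """Lengths of the disjoint cycles of p (fixed points give length 1)."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            length += 1
            x = p[x]
        lengths.append(length)
    return lengths


def order(p):
    """Order of p: the least common multiple of its cycle lengths."""
    result = 1
    for length in cycle_lengths(p):
        result = result * length // gcd(result, length)
    return result


def inverse(p):
    """Inverse vector q, defined by q[p[i]] = i."""
    q = [0] * len(p)
    for i, image in enumerate(p):
        q[image] = i
    return tuple(q)


print(order((2, 3, 0, 4, 1)))     # (0,2)(1,3,4) in origin 0: order 6
print(order((2, 3, 0, 4, 5, 1)))  # (0,2)(1,3,4,5): order 4
```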

Cycle notation is used extensively for hand computation but almost 
never for calculations in a computer. It is convenient to be able to enter 
permutations at a terminal in cycle notation and have the results of a com- 
putation printed in cycle notation. The procedures GPCYCIN and 
GPCYCOUT can be used to convert between cycle notation and the vector 
notation used by most of the procedures that work with permutations. 


      ⎕←X←7 GPCYCIN C←'(0,5,1,3)(2,6)'
5 3 6 0 4 1 2
      ⎕←Y←10 GPCYCIN C
5 3 6 0 4 1 2 7 8 9
      GPCYCOUT X
(0,5,1,3)(2,6)(4)
      GPCYCOUT Y
(0,5,1,3)(2,6)(4)(7)(8)(9)

The cycle representation of a permutation is a character string. Cycles of
length 1 may be omitted in the second argument of GPCYCIN, so the first
argument is used to determine the set ⍳N on which the permutation acts.
Cycles of length 1 are always listed by GPCYCOUT. The current origin is
used to decide whether the permutations act on {0, ..., N−1} or
{1, ..., N}. Both GPCYCIN and GPCYCOUT work with just one permutation at
a time.
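For readers without the APL workspace, the conversion between cycle notation and vector notation can be imitated in Python. This is only a sketch under stated assumptions: `cyc_in` and `cyc_out` are hypothetical stand-ins, not the text's GPCYCIN and GPCYCOUT, the origin is fixed at 0, and the cycles in the input string are assumed to be disjoint.

```python
import re


def cyc_in(n, text):
    """Vector on range(n) for a string of disjoint cycles such as '(0,5,1,3)(2,6)'."""
    p = list(range(n))                      # omitted points stay fixed
    for body in re.findall(r"\(([^()]*)\)", text):
        points = [int(tok) for tok in body.split(",")]
        for i, x in enumerate(points):
            p[x] = points[(i + 1) % len(points)]
    return p


def cyc_out(p):
    """Cycle string for the vector p; cycles of length 1 are always listed."""
    seen, parts = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = p[x]
        parts.append("(" + ",".join(str(v) for v in cycle) + ")")
    return "".join(parts)


x = cyc_in(7, "(0,5,1,3)(2,6)")
print(x)            # [5, 3, 6, 0, 4, 1, 2]
print(cyc_out(x))   # (0,5,1,3)(2,6)(4)
```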


There is one more notation for permutations that is occasionally used.
If X = {x_1, ..., x_n}, then the permutation of X that takes x_1 to y_1,
x_2 to y_2, and so on may be written

( x_1 x_2 ... x_n )
( y_1 y_2 ... y_n )

Thus the element (0,3,6,2)(1,4,5) of Σ_7 could be written

( 0 1 2 3 4 5 6 )      ( 6 5 4 3 2 1 0 )
( 3 4 0 6 5 1 2 )  or  ( 2 1 5 6 0 4 3 ).

Sometimes we will want to work with subsets of Σ_n for small values
of n. We will represent such a subset A by a matrix whose ith row is the
ith element of A. For example, the matrix GP8 in EXAMPLES describes a
set of eight elements of Σ_4 in origin 0.


      GP8
0 1 2 3
0 3 2 1
1 2 3 0
1 0 3 2
2 3 0 1
3 2 1 0
2 1 0 3
3 0 1 2


The procedure GPSYMG can be used to produce a list of all elements of the
symmetric group Σ_n. The result is origin dependent.

      ⎕←P←GPSYMG 3              ⎕←Q←GPSYMG 3
[Output not recoverable: two 6-by-3 matrices listing the six elements
of Σ_3, one for each index origin.]
      ⎕IO←1                     ⎕IO←0


The rows of GP8 actually form a subgroup of Σ_4. Exercises 13 and
14 describe how to verify this fact.


EXERCISES 


1 Find the decomposition as a product of disjoint cycles for the 
following permutations. 
(a) (0,1) (1,2). 
(b) (0,1) (2,3) (0,2,4). 
(c) (1,7,2,8,4) (0,5,2,6,3) (1,5,3,9). 

2 Show that [x] [3x + 2] is a permutation of Z, and determine 
its cycles. 


3 What is the inverse of (0,2,5,3,4,8,1,7,6)?

4 Let N be a positive integer and set F←1⌽⍳N, G←¯1⌽⍳N, and
H←⌽⍳N. Determine the cycles of F, G, and H. Show that G is the
inverse of F and that F[H] is equal to H[G].


5 Let N be a positive integer and M any integer. Show that all cycles
of M⌽⍳N have length N÷M ZGCD N.

6 Let P be a vector that is known to be a permutation of ⍳N. Show
that the vectors F, G, and H defined by

      F←⍋P
      G←P⍳⍳N
      H←N⍴0
      H[P]←⍳N

are all equal to the inverse of P. Which method uses the least CPU
time for large N?


What are the possible orders for an element of 2 ,? 
What is the largest order an element of 2,1. can have? 


9 Determine the cycles of all the permutations listed in the matrix
GP8.

10 Let A and B be two matrices whose rows are permutations of
⍳N. Write an APL proposition corresponding to the assertion
that the set of permutations listed in A is the same as the set listed
in B.

11 Show that a permutation P of ⍳N is uniquely determined by the
integer M←N⊥P. How can P be reconstructed from M and N? We
call M the packed representation of P.

12 Let A be a matrix whose rows are permutations of ⍳N. Write an
expression for the vector V such that V[I] is the packed
representation of A[I;].

13 In order to show that the rows of GP8 form a subgroup of Σ_4,
we must prove that the product of any two rows is again a row.
Let

      Q←GP8[;GP8]

Then Q[I;J;] is the product of GP8[J;] and GP8[I;]. The
most straightforward way to show that each vector Q[I;J;] is
a row of GP8 is to form

      C←∨/Q∧.=⍉GP8

Check that every entry of C is 1 and explain why this proves
that the rows of GP8 form a subgroup of Σ_4. (Clearly, this
approach is appropriate only for very small subgroups of Σ_n.)

14 In order to show in Exercise 13 that the rows of GP8 form a
subgroup of Σ_4, we computed the inner product Q∧.=⍉GP8.
This calculation requires a significant amount of CPU time. Use the
ideas in Exercises 11 and 12 to obtain a more efficient test for
showing that the rows of GP8 are closed under products.

15 Construct a group table for the subgroup of Σ_4 listed in GP8.


8. PERMUTATION GROUPS 


A permutation group on a set X is a subgroup of Σ(X). The importance of
permutation groups is made clear by the following theorem, which states
that up to isomorphism all groups are permutation groups. This theorem
is named for English mathematician Arthur Cayley (1821-1895), who first
gave the definition of an abstract group.

THEOREM 1 (Cayley). If G is a group, then G is isomorphic to a
subgroup of Σ(G).

Proof. In Corollary 2.3 we defined for each u in G a permutation
R_u of G by xR_u = xu. We will show now that R is an injective
homomorphism of G into Σ(G). For all x in G we have

x(R_u R_v) = (xR_u)R_v = (xu)v = x(uv) = xR_{uv},

so R_u R_v = R_{uv}. This means that R is a homomorphism of G into Σ(G).
Suppose u is in the kernel of R. Then xu = x for all x in G. But if xu = x
for even one x in G, then u = 1. Therefore R is injective. If we let H be the
image of G under R, then R is an isomorphism of G onto H. □


The homomorphism R used in the proof of Theorem 1 is called the
right regular representation of G. The word "representation" is often used
to denote a homomorphism of a group H that we know little about into a
group K that we know more about. In the case of Theorem 1, we represent
elements of the "abstract" group G by permutations, which are considered
to be more "concrete".

In Corollary 2.3 we also defined a permutation L_u of G by the formula
xL_u = u^{-1}x. It is easily shown (see Exercise 1) that L is also an
injective homomorphism of G into Σ(G). We call L the left regular
representation of G.

If G is an N-by-N group table, then the image of (⍳N, G) under its right
regular representation is the set of columns of G, considered as elements
of Σ_N. The image of (⍳N, G) under its left regular representation is the
set of rows of G.
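The statement about columns and rows can be checked on a small example. The Python sketch below builds a group table for Σ_3 (the table and all names here are assumptions made for illustration, not data from the text), forms R_u as column u and L_u as the row of u^{-1}, and verifies that both are injective homomorphisms.

```python
from itertools import permutations


def compose(f, g):
    """Product fg under the right action: apply f first, then g."""
    return tuple(g[f[x]] for x in range(len(f)))


# A 6-by-6 group table for the symmetric group on {0, 1, 2}.
elems = list(permutations(range(3)))
index = {p: i for i, p in enumerate(elems)}
n = len(elems)
table = [[index[compose(elems[i], elems[j])] for j in range(n)]
         for i in range(n)]

# Right regular representation: x R_u = x*u, which is column u of the table.
R = [tuple(table[x][u] for x in range(n)) for u in range(n)]

# Left regular representation: x L_u = u^{-1}*x, the row belonging to u^{-1}.
identity = index[tuple(range(3))]
inv = [next(v for v in range(n) if table[u][v] == identity) for u in range(n)]
L = [tuple(table[inv[u]][x] for x in range(n)) for u in range(n)]

r_is_hom = all(compose(R[u], R[v]) == R[table[u][v]]
               for u in range(n) for v in range(n))
l_is_hom = all(compose(L[u], L[v]) == L[table[u][v]]
               for u in range(n) for v in range(n))
rows = {tuple(row) for row in table}
print(r_is_hom, l_is_hom, len(set(R)) == n, set(L) == rows)
```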

In Section 3 we discussed the problem of finding the subgroup
generated by a set of elements in a group defined by a group table G.
Suppose we are given a matrix S whose rows are elements of Σ_N. How might
we construct the subgroup of Σ_N generated by the rows of S? One could
imagine constructing a group table T for Σ_N and using the methods of
Section 3 but, if N > 4, then the storage requirements of T would be
prohibitive. Moreover, we do not need a group table to compute products in
Σ_N. Given the matrix S of generators, we can form S[;S], and this rank 3
array gives all possible products of two rows of S. If some of these
products are new, we can add them to the rows of S and repeat the process
until we get no new permutations. Stated in this form, the procedure still
requires quite a bit of space and CPU time. However, using the ideas in
Exercises 3.13, 7.11, and 7.12, a procedure can be constructed that works
reasonably well for small values of N. The workspace CLASSLIB contains one
such procedure, GPSGP.


      ⎕←S←GP8[1 2;]
0 3 2 1
1 2 3 0
      GPSGP S
0 1 2 3
0 3 2 1
1 2 3 0
1 0 3 2
2 3 0 1
3 2 1 0
2 1 0 3
3 0 1 2
      ⎕←T←2 4⍴1 2 3 0 1 0 2 3
1 2 3 0
1 0 2 3
      H←GPSGP T
      ⍴H
24 4

The procedure GPSGP returns a matrix listing all the elements of the
subgroup generated by the rows of its single argument. In Section 7 we saw
that the rows of the matrix GP8 form a subgroup of Σ_4. Here we see that
this subgroup is generated by the two elements given in S and that the rows
of T generate all of Σ_4.
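The "multiply until nothing new appears" idea can be sketched in Python as follows. This is a naive closure computation written for illustration, not the algorithm used by GPSGP; the name `generate` is hypothetical. Because the group is finite, repeatedly multiplying known elements by the generators already yields inverses as well.

```python
def compose(f, g):
    """Product fg under the right action: apply f first, then g."""
    return tuple(g[f[x]] for x in range(len(f)))


def generate(gens):
    """Set of all elements of the subgroup generated by gens."""
    identity = tuple(range(len(gens[0])))
    elements = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for p in frontier:
            for g in gens:
                q = compose(p, g)
                if q not in elements:
                    elements.add(q)
                    new.append(q)
        frontier = new          # only newly found elements need extending
    return elements


GP8 = [(0, 1, 2, 3), (0, 3, 2, 1), (1, 2, 3, 0), (1, 0, 3, 2),
       (2, 3, 0, 1), (3, 2, 1, 0), (2, 1, 0, 3), (3, 0, 1, 2)]
S = [GP8[1], GP8[2]]
T = [(1, 2, 3, 0), (1, 0, 2, 3)]

print(generate(S) == set(GP8))   # True
print(len(generate(T)))          # 24
```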


The techniques used in GPSGP are very crude, and this procedure
should be used only to investigate small subgroups of Σ_N where N itself is
small. Much more powerful algorithms exist for determining the order of
the group generated by a given set of permutations. However, these
algorithms are beyond the scope of this text, and their computer
implementation, even in APL, is somewhat complicated. Further information
can be found in my paper listed in the bibliography.

One way to look for subgroups of Σ_N when N is small is to choose
small subsets of Σ_N at random and use GPSGP to find the subgroups
generated by these subsets. Suppose two elements f and g of Σ_4 are chosen
randomly and independently. The order of <f, g> must be one of the
numbers 1, 2, 3, 4, 6, 8, 12, or 24. Let p(m) denote the probability that
<f, g> has order m. It can be shown that the values of p(m) are given
by the following table.

 m | p(m)
 1 | 1/576 = .0017...
 2 | 3/64  = .0468...
 3 | 1/18  = .0555...
 4 | 5/48  = .1041...
 6 | 1/8   = .125
 8 | 1/8   = .125
12 | 1/6   = .1666...
24 | 3/8   = .375

Thus the most likely value of m is 24 and the next most likely is 12. If
two random elements f and g are chosen in Σ_n, with n > 4, then the exact
probability that <f, g> will have a particular order is difficult to
compute and not of much interest. However, the most likely order is n! and
the next most likely is n!/2. In fact, it has been shown [Dixon] that in
the limit, as n goes to infinity, <f, g> = Σ_n with probability 3/4 and
<f, g> has index 2 in Σ_n with probability 1/4. This does not mean that
Σ_n has no subgroups of index greater than 2, only that the other subgroups
are comparatively small and not very numerous.
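For Σ_4 the table of p(m) is small enough to confirm by brute force. The Python sketch below enumerates all 576 ordered pairs (f, g) and counts the order of the subgroup each pair generates, using a naive closure routine. The names are hypothetical and the approach is an illustration, not the derivation the text has in mind.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product


def compose(f, g):
    """Product fg under the right action: apply f first, then g."""
    return tuple(g[f[x]] for x in range(len(f)))


def generate(gens):
    """Naive closure: all elements of the subgroup generated by gens."""
    identity = tuple(range(len(gens[0])))
    elements, frontier = {identity}, [identity]
    while frontier:
        new = []
        for p in frontier:
            for g in gens:
                q = compose(p, g)
                if q not in elements:
                    elements.add(q)
                    new.append(q)
        frontier = new
    return elements


s4 = list(permutations(range(4)))
counts = Counter(len(generate([f, g])) for f, g in product(s4, repeat=2))
p = {m: Fraction(c, len(s4) ** 2) for m, c in sorted(counts.items())}
for m, prob in p.items():
    print(m, prob)
```

The printed fractions can be compared line by line with the table above.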

Dixon's theorem is beyond the scope of this text. However, we will
prove the important result that Σ_n always has a subgroup of index 2 when
n ≥ 2. To do this, we will exhibit a homomorphism of Σ_n onto a group of
order 2. Assume n ≥ 2. If f is in Σ_n, we define the sign of f to be the
rational number

\[ \operatorname{sgn}(f) = \prod_{i>j} \frac{(if) - (jf)}{i - j}. \]

Here the product is over all integers i and j, with 0 ≤ j < i < n. As an
example, the sign of (0,2,1) in Σ_3 is

\[ \left(\frac{0-2}{1-0}\right)\left(\frac{1-2}{2-0}\right)\left(\frac{1-0}{2-1}\right) = 1 \]

and the sign of (0)(1,2) is

\[ \left(\frac{2-0}{1-0}\right)\left(\frac{1-0}{2-0}\right)\left(\frac{1-2}{2-1}\right) = -1. \]

The following theorem justifies the use of the term "sign".
The following theorem justifies the use of the term “‘sign’’. 


THEOREM 2. Let f be an element of Σ_n, n ≥ 2. Then sgn(f) is either
1 or −1.

Proof. By definition, we have

\[ \operatorname{sgn}(f) = \prod_{i>j} \frac{(if) - (jf)}{i - j}
   = \frac{\prod_{i>j} [(if) - (jf)]}{\prod_{i>j} (i - j)}. \]

Fix i and j with 0 ≤ j < i < n. We wish to show that the factor (i − j) in
the denominator of sgn(f) also occurs, perhaps with its sign changed, in the
numerator of sgn(f). Let u = if^{-1} and v = jf^{-1}. Then uf = i and vf = j.
If u > v, then the factor (uf) − (vf) = i − j occurs in the numerator of
sgn(f); if u < v, then (vf) − (uf) = j − i occurs in the numerator. Choosing
a different pair (i, j) leads to a different factor in the numerator, so
there is a 1-1 correspondence between the factors in the denominator and
the factors in the numerator such that corresponding factors differ at most
by a sign. Therefore sgn(f) is 1 or −1. □


There are other ways of computing sgn(f). Since |sgn(f)| = 1, it is
enough to count the number of terms in the product

\[ \prod_{i>j} \frac{(if) - (jf)}{i - j} \]

that are negative. The factor corresponding to the pair (i, j) is negative
if and only if if < jf. Thus sgn(f) = (−1)^m, where m is the number of pairs
(i, j), with i > j and if < jf. Such a pair is called an inversion in f. The
procedure GPSGN uses this method to compute the sign of one or more
permutations.
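Both ways of computing the sign are easy to compare in Python. In the sketch below, `sign` counts inversions and `sign_by_product` evaluates the defining product with exact rational arithmetic; both names are hypothetical and neither is the text's GPSGN.

```python
from fractions import Fraction


def sign(p):
    """(-1)^m, where m is the number of pairs i > j with p[i] < p[j]."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i) if p[i] < p[j])
    return -1 if inversions % 2 else 1


def sign_by_product(p):
    """The defining product of ((if) - (jf)) / (i - j) over all i > j."""
    result = Fraction(1)
    for i in range(len(p)):
        for j in range(i):
            result *= Fraction(p[i] - p[j], i - j)
    return result


GP8 = [(0, 1, 2, 3), (0, 3, 2, 1), (1, 2, 3, 0), (1, 0, 3, 2),
       (2, 3, 0, 1), (3, 2, 1, 0), (2, 1, 0, 3), (3, 0, 1, 2)]
print([sign(p) for p in GP8])   # [1, -1, -1, 1, 1, 1, -1, -1]
print(sign((2, 0, 1)))          # (0,2,1) in vector form: 1
print(sign((0, 2, 1)))          # (0)(1,2) in vector form: -1
```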


      GP8
0 1 2 3
0 3 2 1
1 2 3 0
1 0 3 2
2 3 0 1
3 2 1 0
2 1 0 3
3 0 1 2
      ⎕←S←GPSGN GP8
1 ¯1 ¯1 1 1 1 ¯1 ¯1
      GPSGN 3 GPCYCIN '(0,2,1)'
1
      GPSGN 3 GPCYCIN '(0)(1,2)'
¯1

Here S[I] is the sign of GP8[I;].

THEOREM 3. If n ≥ 2, then sgn is a homomorphism of Σ_n onto the
group ({1, −1}, ×).

Proof. By Theorem 2, sgn is a map from Σ_n to {1, −1}. It is easy to
check (see Exercise 6) that the sign of the 2-cycle (0, 1) is −1. What
remains is to show that sgn is a homomorphism. Let f and g be in Σ_n. We
must prove sgn(fg) = [sgn(f)][sgn(g)]. By definition,

\[ \operatorname{sgn}(fg) = \prod_{i>j} \frac{(ifg) - (jfg)}{i - j}. \]

Multiplying by

\[ \prod_{i>j} \frac{(if) - (jf)}{(if) - (jf)} \]

and rearranging the denominator, we obtain

\[ \operatorname{sgn}(fg)
   = \prod_{i>j} \frac{(ifg) - (jfg)}{(if) - (jf)} \prod_{i>j} \frac{(if) - (jf)}{i - j}
   = \left[ \prod_{i>j} \frac{(ifg) - (jfg)}{(if) - (jf)} \right] \operatorname{sgn}(f). \]

Fix i and j with i > j and set u = if and v = jf. If u > v, then

\[ \frac{(ifg) - (jfg)}{(if) - (jf)} = \frac{(ug) - (vg)}{u - v} \]

is a factor of sgn(g). If u < v, then

\[ \frac{(ifg) - (jfg)}{(if) - (jf)} = \frac{(ug) - (vg)}{u - v} = \frac{(vg) - (ug)}{v - u} \]

is again a factor of sgn(g). Thus

\[ \prod_{i>j} \frac{(ifg) - (jfg)}{(if) - (jf)} = \operatorname{sgn}(g) \]

and sgn(fg) = [sgn(f)][sgn(g)]. □

The kernel of sgn is called the alternating group and is denoted A_n.
The index of A_n in Σ_n is 2, and A_n is normal in Σ_n. Elements of A_n are
said to be even permutations, while elements of Σ_n − A_n are said to be
odd. The reason for this terminology will be discussed in Section 10.

Now we will show how a permutation group G on a set X defines an
equivalence relation on X. If x and y are in X, let us write x ~ y if there
exists an element g of G such that y = xg.


LEMMA 4. The relation ~ is an equivalence relation on X.

Proof. If e is the identity permutation on X, then e ∈ G and x = xe.
Thus ~ is reflexive. If y = xg, then x = yg^{-1} and ~ is symmetric. If
y = xg and z = yh, then z = x(gh), proving that ~ is transitive. □

The equivalence classes of ~ are called the orbits of G, and the orbit
containing a point x of X is xG = {xg | g ∈ G}. The set
G_x = {g ∈ G | xg = x} is a subgroup of G and is called the stabilizer of
x in G.

THEOREM 5. Let G be a permutation group on X. If x is in X, then
there is a bijection from the orbit xG containing x to the set of right
cosets of G_x in G.

Proof. Every element of xG has the form xg for some g in G. We would
like to map xg to the coset G_x g, but is this well defined? If xg = xh, then
xgh^{-1} = x and gh^{-1} is in G_x. This implies G_x g = G_x h, so our map
is well defined. The map is clearly surjective. All that remains is to
verify injectivity. If G_x g = G_x h, then h = ug for some u in G_x.
Therefore xh = x(ug) = xg, and the map is injective too. □

COROLLARY 6. If, in Theorem 5, the set X is finite, then |xG| =
|G:G_x|. In particular, |xG| divides |G|. □
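Corollary 6 can be illustrated on the group listed in GP8. In the Python sketch below (names hypothetical), the elements of the group are sorted into classes according to the image of x; each class is a right coset of the stabilizer, so the number of classes is the orbit length and every class has |G_x| elements.

```python
GP8 = [(0, 1, 2, 3), (0, 3, 2, 1), (1, 2, 3, 0), (1, 0, 3, 2),
       (2, 3, 0, 1), (3, 2, 1, 0), (2, 1, 0, 3), (3, 0, 1, 2)]


def orbit_and_stabilizer(group, x):
    """Orbit xG and stabilizer G_x for a group given as a full element list."""
    orbit = sorted({g[x] for g in group})
    stabilizer = [g for g in group if g[x] == x]
    return orbit, stabilizer


def cosets_by_image(group, x):
    """Group the elements g by xg; each class is a right coset of G_x."""
    classes = {}
    for g in group:
        classes.setdefault(g[x], []).append(g)
    return classes


for x in range(4):
    orbit, stab = orbit_and_stabilizer(GP8, x)
    classes = cosets_by_image(GP8, x)
    print(x, orbit, len(stab), len(orbit) * len(stab) == len(GP8))
```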


Suppose the rows of the matrix P give the elements of some subgroup
G of Σ_N. Then the orbit of G containing a point I is given by the vector
SSORT P[;I]. However, suppose we are given only a set of generators
for G, say as the rows of a matrix S. One method for computing the orbit
of G containing I would be to form P←GPSGP S. But P may be a very large
matrix that will not fit in a workspace. It is possible to determine the
orbit containing I without constructing P first. The method is sketched in
Exercise 12. The procedure GPORBIT is based on this approach. The first
argument of GPORBIT is a matrix S whose rows list a set of permutations.
The second argument is a single integer I. The result of GPORBIT is a vector
listing the points in the orbit containing I of the group generated by the
rows of S. In addition, the characteristic vector of the orbit is produced
as the global variable X.

      P20
 6 16 12  0  8 11 18 13 15 19  9  5  2  1 14  4  7 17  3 10
12  7 18  2 10 13  0  1  4 19 15 11  6 17  5  8 14 16  3  9
      GPCYCOUT P20[0;]
(0,6,18,3)(1,16,7,13)(2,12)(4,8,15)(5,11)(9,19,10)(14)(17)
      GPCYCOUT P20[1;]
(0,12,6)(1,7)(2,18,3)(4,10,15,8)(5,13,17,16,14)(9,19)(11)
      P20 GPORBIT 0
0 2 3 6 12 18
      P20 GPORBIT 1
1 5 7 11 13 14 16 17
      P20 GPORBIT 4
4 8 9 10 15 19
      X
0 0 0 0 1 0 0 0 1 1 1 0 0 0 0 1 0 0 0 1

In this example the rows of P20 are elements of Σ_20. The orbits of the
group generated by these permutations are {0,2,3,6,12,18},
{1,5,7,11,13,14,16,17}, and {4,8,9,10,15,19}.
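An orbit can be found from the generators alone by repeatedly applying them to the points found so far, as Exercise 12 suggests. The Python sketch below applies this to the two rows of P20. The functions `orbit` and `all_orbits` are hypothetical illustrations, not the text's GPORBIT and GPALLORB, although `all_orbits` returns comparable information (first point and size of each orbit, plus the first point of the orbit of each point).

```python
P20 = [
    (6, 16, 12, 0, 8, 11, 18, 13, 15, 19, 9, 5, 2, 1, 14, 4, 7, 17, 3, 10),
    (12, 7, 18, 2, 10, 13, 0, 1, 4, 19, 15, 11, 6, 17, 5, 8, 14, 16, 3, 9),
]


def orbit(gens, start):
    """Smallest set containing start and closed under every generator."""
    seen = {start}
    queue = [start]
    for y in queue:             # the list grows while it is being scanned
        for g in gens:
            z = g[y]
            if z not in seen:
                seen.add(z)
                queue.append(z)
    return sorted(seen)


def all_orbits(gens):
    """(first point, size) for each orbit, and the first point for each point."""
    n = len(gens[0])
    first = [None] * n
    summary = []
    for x in range(n):
        if first[x] is None:
            orb = orbit(gens, x)
            for y in orb:
                first[y] = x
            summary.append((x, len(orb)))
    return summary, first


print(orbit(P20, 0))        # [0, 2, 3, 6, 12, 18]
print(all_orbits(P20)[0])   # [(0, 6), (1, 8), (4, 6)]
```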

Given a matrix of permutations, we can use the procedure GPALLORB
to obtain a summary of the orbits of the group G generated by these
permutations.

      GPALLORB P20
0 1 4
6 8 6
      Q
0 1 0 0 4 1 0 1 4 4 4 1 0 1 1 4 1 1 0 4
      (Q=1)/⍳20
1 5 7 11 13 14 16 17

The procedure GPALLORB returns a matrix with two rows. The first row
lists the first element of each orbit of G and the second row lists the
number of elements in each orbit. In addition, GPALLORB constructs a global
vector Q such that Q[I] is the first point in the orbit of G containing I.
Thus Q=Q[I] is the characteristic vector for the orbit containing I. The
procedure GPALLORB plays an important part in the calculations done in
Section 9.

At this point it will be helpful in our discussion of permutation groups
to look at an example. Let the vertices of a regular polygon P with n sides
be labeled from 0 to n − 1 in clockwise order.

A symmetry of P is a permutation of the vertices that maps edges into
edges. For example, the symmetry a = (0,1,2,...,n − 1) corresponds to
a clockwise rotation of P through an angle of 2π/n radians. The symmetry
b = (0)(1,n − 1)(2,n − 2)... is the reflection in the line through vertex
0 and the center of P. The set G of all symmetries of P is a subgroup of
Σ_n. Since a is in G and the powers of a map 0 to each of the other vertices,
G has one orbit: {0,1,...,n − 1}. By Lagrange's Theorem (Theorem 3.9),
we have |G| = |G:G_0||G_0| and, by Corollary 6, we know that |G:G_0| = n.
The only vertices joined to vertex 0 by an edge are 1 and n − 1. Thus any
element of G_0 either fixes 1 or maps 1 to n − 1. Thus the orbit 1G_0 of G_0
containing 1 has at most two elements. Since b is in G_0 and b interchanges
1 and n − 1, the orbit 1G_0 has exactly two elements. If we let
G_{01} = (G_0)_1, the set of elements in G that fix both 0 and 1, then
|G_0| = |G_0:G_{01}||G_{01}| = 2|G_{01}| and so |G| = 2n|G_{01}|. If x is
in G_{01}, then x fixes 0 and 1. Thus x fixes 2 because 2 is the only vertex
other than 0 connected to 1. But then x fixes 3, since 3 is the only vertex
besides 1 connected to 2. Continuing in this manner, we find that x fixes
every vertex. Thus |G_{01}| = 1 and |G| = 2n. The group G is called the
dihedral group of order 2n and is denoted D_{2n}. (Some authors write D_n.)
Exercises 17 and 18 provide more information about D_{2n}.
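The count |G| = 2n can be cross-checked from below by generating the subgroup <a, b> directly. The Python sketch that follows uses a naive closure routine and hypothetical names. It confirms that a and b generate a group of order 2n for several n ≥ 3 and that bab is the inverse of a, but it does not replace the argument above showing that there are no further symmetries.

```python
def compose(f, g):
    """Product fg under the right action: apply f first, then g."""
    return tuple(g[f[x]] for x in range(len(f)))


def generate(gens):
    """Naive closure: all elements of the subgroup generated by gens."""
    identity = tuple(range(len(gens[0])))
    elements, frontier = {identity}, [identity]
    while frontier:
        new = []
        for p in frontier:
            for g in gens:
                q = compose(p, g)
                if q not in elements:
                    elements.add(q)
                    new.append(q)
        frontier = new
    return elements


def dihedral_generators(n):
    """Rotation a = (0,1,...,n-1) and reflection b = (0)(1,n-1)(2,n-2)..."""
    a = tuple((i + 1) % n for i in range(n))
    b = tuple((-i) % n for i in range(n))
    return a, b


for n in range(3, 9):
    a, b = dihedral_generators(n)
    a_inverse = tuple((i - 1) % n for i in range(n))
    print(n, len(generate([a, b])), compose(compose(b, a), b) == a_inverse)
```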

Cycles of length 2 have a significant role in the study of symmetric 
groups. The term transposition is often used to refer to a 2-cycle. 


THEOREM 7. For any integer n > 1, the symmetric group Σ_n is
generated by the transpositions (0,1), (1,2), ..., (n − 2, n − 1).

Proof. Let G = <(0,1), ..., (n − 2, n − 1)>. Clearly, G = Σ_n if n = 2.
We will assume n > 2 and proceed by induction. For any i < n − 1, the
image of n − 1 under the product (n − 1, n − 2)(n − 2, n − 3)...(i + 1, i)
is i. Thus {0, ..., n − 1} is an orbit of G. Consider the stabilizer G_{n-1}
of n − 1 in G. The transpositions (0,1), ..., (n − 3, n − 2) are all in
G_{n-1} so, by induction, the order of G_{n-1} is at least (n − 1)!.
However, by Theorem 5, the index |G:G_{n-1}| is n. Thus

n! ≥ |G| = |G:G_{n-1}||G_{n-1}| ≥ n[(n − 1)!] = n!.

Thus |G| = |Σ_n| and G = Σ_n. □
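For small n, Theorem 7 can be confirmed by generating the subgroup and counting its elements. The Python sketch below does this for n = 2, ..., 5 with a naive closure routine; the helper names are hypothetical.

```python
from math import factorial


def compose(f, g):
    """Product fg under the right action: apply f first, then g."""
    return tuple(g[f[x]] for x in range(len(f)))


def generate(gens):
    """Naive closure: all elements of the subgroup generated by gens."""
    identity = tuple(range(len(gens[0])))
    elements, frontier = {identity}, [identity]
    while frontier:
        new = []
        for p in frontier:
            for g in gens:
                q = compose(p, g)
                if q not in elements:
                    elements.add(q)
                    new.append(q)
        frontier = new
    return elements


def adjacent_transpositions(n):
    """The transpositions (0,1), (1,2), ..., (n-2,n-1) as vectors."""
    gens = []
    for i in range(n - 1):
        p = list(range(n))
        p[i], p[i + 1] = p[i + 1], p[i]
        gens.append(tuple(p))
    return gens


for n in range(2, 6):
    print(n, len(generate(adjacent_transpositions(n))), factorial(n))
```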
The following slight generalization of Theorem 5 will be used in the
next section.

THEOREM 8. Let G be a group, let X be a set, and let f: G → Σ(X) be
a homomorphism. For any x in X there is a bijection from the orbit
containing x of the image Gf to the set of right cosets of the stabilizer
G_x = {u ∈ G | x(uf) = x}.

Proof. We know by Theorem 5 that there is a bijection from x(Gf)
to the set of right cosets of (Gf)_x in Gf. But G_x = [(Gf)_x]f^{-1} and so,
by Corollary 5.11, there is a bijection from the set of right cosets of
(Gf)_x in Gf to the set of right cosets of G_x in G. □


EXERCISES 


1 Show that the left regular representation of a group G is an
injective homomorphism of G into Σ(G).

2 Let G be a group and let H and K be the images of G under the
right and left regular representations of G, respectively. Prove that
elements of H commute with elements of K.

3 In Exercise 2 show that K is the centralizer of H in Σ(G) and that
H is the centralizer of K. (See Exercise 3.8.)

4 Use GPSGP to find the orders of the groups generated by
(a) (0)(1,2,3,4,5) and (0,1)(2)(3,4)(5).
(b) (0,1,3,2,5,6,4) and (0)(1)(2)(3,4)(5,6).

5 The procedure

      ∇M←TRIAL N;S
[1]   S←(2,N)⍴(N?N),N?N
[2]   M←1↑⍴GPSGP S∇

chooses two elements of Σ_N at random and returns the order of
the group generated by them. Execute TRIAL 5 at least 20 times
and use the results to estimate the probability that two elements
of Σ_5 chosen randomly generate Σ_5.


6 Let n be an integer greater than 1. Show that the sign of the
2-cycle (0,1) in Σ_n is −1.

7 Count the inversions in the following permutations.
(a) (0,3,4)(1,2).
(b) (0,7)(1,6)(2,5)(3,4).
(c) (0,3,5)(2)(4,6).
(d) (0,7,1,6,2,5,3,4).

8 Let P be a vector that is a permutation of ⍳⍴P. Write an APL
expression for the number of inversions in P.

9 Show that the product of two even permutations is even, the
product of an even and an odd permutation is odd, and the product
of two odd permutations is even. Show also that sgn(f^{-1}) = sgn(f)
for all f in Σ_n.

10 We defined Σ_n to be the symmetric group on the set {0,...,n − 1}
and we showed that Σ_n has a subgroup of index 2 when n ≥ 2.
Suppose X is any finite set with |X| ≥ 2. Prove that Σ(X) has a
subgroup of index 2.

11 Let n ≥ 2. Show that the probability that two elements chosen
randomly in Σ_n generate Σ_n is at most 3/4.

12 Let X be a finite set and let U be a subset of Σ(X). Set G = <U>.
Prove that if x is in X, then the orbit xG is the smallest subset
Y of X such that x ∈ Y and for all y in Y and all u in U the image
yu is in Y. Use this idea to find the orbits of the subgroup <u, v>
of Σ_10, where

u = (0,6,8)(1,5)(2,7)(3,9)(4),
v = (0)(1)(2,6,7)(3)(4,9)(5)(8).

Let G be the subgroup of 2.9 generated by 54150, 6150, and 
4 0,2+150. What are the orbits of G? What is |G:G, |? (Try 
to solve the problem by hand and then check your answer using 
GPALLORB.) 

14 Show that Σ_n is generated by (0,1) and (0,1,...,n − 1).

15 Show that Σ_n is generated by (0,1) and (1,2,...,n − 1).

16 Let G = (X, E) be an undirected graph. An automorphism of
G is an isomorphism of G with itself. (See Section 1.5.) Show
that the set Aut(G) of all automorphisms of G is a subgroup of
Σ(X).


List the elements of the dihedral group D,. 


18 Let a and b be the elements of the dihedral group D_{2n} defined
in the text. Show that a^n = b^2 = 1 and that bab = a^{-1}. Prove
that every element of D_{2n} can be expressed uniquely as b^i a^j,
with 0 ≤ i ≤ 1 and 0 ≤ j < n. Show that every subgroup of <a>
is normal in D_{2n}.


Let G and H be permutation groups on X and Y, respectively. We 
say G and A are isomorphic as permutation groups if there is a 
bijection @:X—~Y and an isomorphism f:G——-AH such that (xu)@ = 
(x6) (uf) for all x in X and all u in G Show that 2, has two 
cyclic subgroups that are isomorphic as groups but not as permu- 
tation groups. 


20 Let G be an N-by-N matrix defining a binary operation on ⍳N. Show
that G is a group table if and only if there is a two-sided identity
element and the set of rows of G is a subgroup of Σ_N.

*21 The time needed to verify that a given n-by-n matrix G is a group
table directly from the definition of a group is proportional to n^3,
since associativity must be checked for all possible triples of
elements. Show that it is possible to decide whether G is a group table
in a time proportional to n^2 log_2 n using an approach based on the
following outline.

(a) Let U = {0,1,...,n − 1} and for x, y in U let xy = G[x;y].

(b) Find, if possible, a two-sided identity e.

(c) Find, if possible, a subset X of U such that |X| ≤ log_2 n,
for each x in X the map y ↦ yx is in Σ_n, and the group
generated by these permutations has U as an orbit.

(d) For all x in X and y and z in U, check that (yz)x = y(zx).

*22 Construct an algorithm for deciding whether an n-by-n matrix is
a group table such that the execution time for the algorithm is
proportional to n^2.

23 Let X be a finite set and let G be a subgroup of Σ(X). For g in
G let χ(g) be the number of elements fixed by g. Show that

\[ \sum_{g \in G} \chi(g) = k|G|, \]

where k is the number of orbits of G on X. The function χ is called
the permutation character of G. Hint. Let U = {(g,x) | g ∈ G, x ∈ X,
xg = x}. Thus χ(g) is the number of elements of U whose first
component is g. Therefore

\[ \sum_{g \in G} \chi(g) = |U|. \]

However, for a given x in X the pair (g, x) is in U if and only if
g is in the stabilizer G_x. Thus

\[ |U| = \sum_{x \in X} |G_x|. \]

Use Theorem 5 to show that if Y is an orbit of G, then

\[ \sum_{x \in Y} |G_x| = |G|. \]


*9 GRAPHS WITH A SMALL NUMBER OF VERTICES 


In this section we will show how the theory of groups developed so far can 
be used to determine, up to isomorphism, the undirected graphs with n 
vertices, where n < 5. 

Let X be a set and let Y = 2^X, the set of all subsets of X. Suppose
f is in Σ(X). If A is in Y, that is, if A ⊆ X, then {af | a ∈ A} is in Y
also. Thus we may define a function f̄: Y → Y by A f̄ = {af | a ∈ A}.

LEMMA 1. The map ¯: f ↦ f̄ is a homomorphism of Σ(X) into Σ(Y).

Proof. Suppose f and g are in Σ(X) and A is in Y. Then

\[ A(\bar{f} \circ \bar{g}) = (A\bar{f})\bar{g} = \{af \mid a \in A\}\bar{g}
   = \{(af)g \mid a \in A\} = \{a(f \circ g) \mid a \in A\}
   = A(\overline{f \circ g}). \]

Thus $\bar{f} \circ \bar{g} = \overline{f \circ g}$. Let e be the identity
function on X. Then A ē = {ae | a ∈ A} = {a | a ∈ A} = A, and ē is the
identity function on Y. Now, taking g = f^{-1}, we have

\[ \bar{f} \circ \overline{f^{-1}} = \overline{f \circ f^{-1}} = \bar{e}
   = \overline{f^{-1} \circ f} = \overline{f^{-1}} \circ \bar{f}. \]

Therefore f̄ is a permutation of Y by Corollary 1.3.4 and ¯ maps Σ(X)
into Σ(Y). Since $\bar{f} \circ \bar{g} = \overline{f \circ g}$, the map ¯
is a homomorphism. □


It is easy to see that if we had defined Y to be the set of all subsets
of X with a given number, say k, of elements, then we would again have
obtained a homomorphism of Σ(X) into Σ(Y).

Now let n be a positive integer and let X = {0,1,...,n − 1}. Define
Y to be the set of two-element subsets of X and Z to be the set of all
subsets of Y. If E is an element of Z, then E is a set of two-element
subsets of X and (X, E) is an undirected graph. As before, we have
homomorphisms of Σ_n = Σ(X) into Σ(Y) and of Σ(Y) into Σ(Z). For f in Σ_n,
let f̄ be the image of f under the composition of these homomorphisms.
Thus, if E is in Z, then

E f̄ = {{xf, yf} | {x,y} ∈ E}.

Let H be the image of Σ_n under ¯, that is, {f̄ | f ∈ Σ_n}. Since H is a
subgroup of Σ(Z), we may define the orbits Z_1, ..., Z_r of H on Z.

THEOREM 2. Let G = (U, F) be an undirected graph with n vertices.
Then G is isomorphic to a graph (X, E), where E is in Z. Moreover, two
graphs (X, E) and (X, E'), with E and E' in Z, are isomorphic if and only
if E and E' are in the same orbit of H. Up to isomorphism there are
exactly r undirected graphs with n vertices.

Proof. Since |U| = |X| = n, there is a bijection g: U → X. Let E =
{{ug, vg} | {u, v} ∈ F}. Then E is in Z and g is an isomorphism of G onto
(X, E). Now let E and E' be any two elements of Z. The graphs (X, E) and
(X, E') are isomorphic if and only if there is a permutation f of X such
that

E' = {{xf, yf} | {x,y} ∈ E} = E f̄,

that is, if and only if E and E' are in the same orbit of H. □


Let us apply Theorem 2 for some small values of $n$. If $n = 1$, then $|X| = 1$, $|Y| = 0$, and $|Z| = 1$. Clearly, there is up to isomorphism only one undirected graph with one vertex. If $n = 2$, then $|X| = 2$, $|Y| = 1$, and $|Z| = 2$. The two elements of $Z$ correspond to graphs that are obviously nonisomorphic. Thus, up to isomorphism, there are two graphs having two vertices. If $n = 3$, then $X = \{0, 1, 2\}$ and $Y = \{a, b, c\}$, where

$$a = \{0, 1\}, \qquad b = \{0, 2\}, \qquad c = \{1, 2\}.$$

There are eight subsets of $Y$, so $Z$ consists of the sets

$$\begin{array}{ll}
E_0 = \emptyset, & E_4 = \{a\}, \\
E_1 = \{c\}, & E_5 = \{a, c\}, \\
E_2 = \{b\}, & E_6 = \{a, b\}, \\
E_3 = \{b, c\}, & E_7 = \{a, b, c\}.
\end{array}$$


(The numbering has been chosen to agree with our APL formulation that follows.) Now, by Exercise 8.14, the symmetric group $\Sigma_3$ is generated by $x = (0,1)$ and $y = (0,1,2)$. The group $H$ is therefore generated by the images $\bar x$ and $\bar y$ of $x$ and $y$ under the homomorphism from $\Sigma_3$ to $\Sigma(Z)$. To compute $\bar x$, we first note that $x$ induces $x' = (a)(b,c)$ on $Y$. To see what $\bar x$ does to a particular set $E_i$, we simply apply $x'$ to the elements of $E_i$ and check in the list to see which set $E_j$ is obtained. A simple computation shows that

$$\bar x = (E_0)(E_1, E_2)(E_3)(E_4)(E_5, E_6)(E_7),$$

which we will abbreviate as

$$\bar x = (0)(1,2)(3)(4)(5,6)(7),$$

writing $i$ instead of $E_i$. Similarly, since $y$ takes $\{0,1\}$ to $\{1,2\}$, $\{1,2\}$ to $\{0,2\}$, and $\{0,2\}$ to $\{0,1\}$, we see that $y$ induces $y' = (a,c,b)$ on $Y$ and

$$\bar y = (0)(1,2,4)(3,6,5)(7)$$

on $Z$. The orbits of $H = \langle \bar x, \bar y \rangle$ are

$$\{0\}, \quad \{1,2,4\}, \quad \{3,5,6\}, \quad \{7\}.$$


Thus, up to isomorphism, there are four graphs with three vertices. Representatives for the isomorphism classes are given by $(X, E_i)$ for $i = 0, 1, 3, 7$. The corresponding diagrams follow.

[Figure: diagrams of the four graphs $(X, E_0)$, $(X, E_1)$, $(X, E_3)$, and $(X, E_7)$, each drawn on the vertices 0, 1, 2.]


For $n = 4$ we have $|X| = 4$, $|Y| = 6$, and $|Z| = 64$. The computations that were easy for $n = 3$ now are somewhat tedious to carry out by hand. Let us try to reformulate, using APL, what we did for $n = 3$ in a way that can be used for larger $n$. The elements of $Y$ are listed in the following matrix.

      ⎕←S←2 SSUB N←3
0 1
0 2
1 2

In general $\Sigma_n$ is generated by $x = (0,1)$ and $y = (0,1,2,\ldots,n-1)$, which are given by the vectors

      ⎕←X←1 0,2↓⍳N
1 0 2
      ⎕←Y←1⌽⍳N
1 2 0

Our first task is to determine $x'$ and $y'$, which give the action of $x$ and $y$ on $Y$. Now the images of the rows of S under X are given by the rows of

      X[S]
1 0
1 2
0 2


Unfortunately, the matrix X[S] is not just S with its rows permuted. The entries of the first row have also been permuted. This illustrates the problems that can arise when we list the elements of sets instead of using characteristic vectors. The matrix

      ⎕←T←N SCHV S
1 1 0
1 0 1
0 1 1

gives the characteristic vectors for the elements in $Y$. If we let

      ⎕←U←N SCHV X[S]
1 1 0
0 1 1
1 0 1

then U gives the characteristic vectors for the images of the elements of $Y$ under $x$. The matrix U has the form T[X1;] for a uniquely determined permutation X1. To obtain X1 explicitly, we use a trick. Let

      ⎕←B←2⊥⍉T
6 5 3

Then B[I] is the integer whose binary digits are given by T[I;], and B[I] uniquely determines I. Therefore, if

      ⎕←C←2⊥⍉U
6 3 5

then C[I] must be B[X1[I]]. But, by the definition of the dyadic operation ⍳, this means that

      ⎕←X1←B⍳C
0 2 1

To compute the vector Y1 corresponding to $y'$ on $Y$, we calculate

      ⎕←V←N SCHV Y[S]
0 1 1
1 1 0
1 0 1
      ⎕←D←2⊥⍉V
3 6 5
      ⎕←Y1←B⍳D
2 0 1

The elements of $Z$ are subsets of the rows of S. We may identify these with the subsets of ⍳1↑⍴S, the index set for the rows of S. The characteristic vectors of the elements of $Z$ are therefore given by the rows of

      ⎕←R←⍉(3⍴2)⊤⍳8
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1

To describe $\bar x$, we need to determine the vector XB such that R[XB[I];] is the characteristic vector for the image under X1 of the set whose characteristic vector is R[I;].

LEMMA 3. Let Q be the characteristic vector for the subset of ⍳M listed in the vector A. If P is a permutation of ⍳M, then Q[⍋P] is the characteristic vector for P[A].

Proof. Let QP←Q[⍋P]. The Ith component of QP is Q[J], where J is (⍋P)[I]. Now ⍋P is the inverse of P, so I is P[J]. Since Q[J] is 1 if and only if J is in A, it follows that QP[I] is 1 if and only if I is the image under P of an element J in A, that is, if and only if I is in P[A]. □
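Lemma 3 can be checked on a small case. The following Python sketch is not part of the APL workspace; the helper names `inverse` and `char_vector` and the sample data are assumptions chosen for illustration. It uses the fact quoted in the proof that the grade-up of a permutation is its inverse.

```python
def inverse(p):
    """Inverse of a permutation given as a list of images (plays the role of grade-up)."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return inv

def char_vector(subset, m):
    """Characteristic vector of a subset of range(m)."""
    return [1 if i in subset else 0 for i in range(m)]

p = [2, 0, 3, 1, 4]                  # a permutation of range(5)
a = [0, 3]                           # the subset A
q = char_vector(a, 5)                # Q
qp = [q[j] for j in inverse(p)]      # Q indexed by the inverse of P
image = {p[j] for j in a}            # P[A]
print(qp, qp == char_vector(image, 5))   # [0, 1, 1, 0, 0] True
```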


By Lemma 3, the matrix

      ⎕←RX←R[;⍋X1]
0 0 0
0 1 0
0 0 1
0 1 1
1 0 0
1 1 0
1 0 1
1 1 1

is such that RX[I;] is the characteristic vector for the image under X1 of the set whose characteristic vector is R[I;]. Since we constructed R so that R[I;] is the vector of binary digits for I, we can find out which row of R is equal to RX[I;] by considering RX[I;] to be a number in binary notation. Thus

      ⎕←XB←2⊥⍉RX
0 2 1 3 4 6 5 7


Note that XB does give the permutation $\bar x$ that we computed previously. We can also perform the check

      ∧/,RX=R[XB;]
1

To compute the vector YB corresponding to $\bar y$, we form

      ⎕←YB←2⊥⍉R[;⍋Y1]
0 2 4 6 1 3 5 7

Again the result checks with our earlier calculation.

All that remains is to compute the orbits of the group $H$ generated by XB and YB. To do this, we construct a matrix whose rows are our generators and apply GPALLORB.

      ⎕←P←(2,⍴XB)⍴XB,YB
0 2 1 3 4 6 5 7
0 2 4 6 1 3 5 7
      GPALLORB P
0 1 3 7
1 3 3 1

The matrix returned by GPALLORB indicates that $H$ has four orbits with representatives 0, 1, 3, and 7 and that the lengths of these orbits are 1, 3, 3, and 1, respectively. All of this is in agreement with our previous computation.
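For readers without an APL system, the same calculation can be sketched in Python. This is an illustrative reformulation rather than a translation of the workspace functions: the names `induced_on_edge_sets`, `orbits`, and `graph_orbits` are hypothetical, and subsets of $Y$ are encoded as integers whose binary digits match the rows of R.

```python
from itertools import combinations

def induced_on_edge_sets(perm, n):
    """Permutation of range(2**m) induced by perm on sets of two-element subsets."""
    pairs = list(combinations(range(n), 2))        # the set Y, in the order of S
    index = {frozenset(p): i for i, p in enumerate(pairs)}
    m = len(pairs)
    # action on Y: pair number i is sent to pair number on_pairs[i]
    on_pairs = [index[frozenset((perm[a], perm[b]))] for a, b in pairs]
    images = []
    for code in range(2 ** m):
        image = 0
        for i in range(m):
            # pair i corresponds to binary digit m-1-i, as in the rows of R
            if (code >> (m - 1 - i)) & 1:
                image |= 1 << (m - 1 - on_pairs[i])
        images.append(image)
    return images

def orbits(generators, size):
    """Orbits on range(size) of the group generated by the given permutations."""
    seen, result = set(), []
    for start in range(size):
        if start in seen:
            continue
        orbit, stack = {start}, [start]
        while stack:
            point = stack.pop()
            for g in generators:
                if g[point] not in orbit:
                    orbit.add(g[point])
                    stack.append(g[point])
        seen |= orbit
        result.append(sorted(orbit))
    return result

def graph_orbits(n):
    x = [1, 0] + list(range(2, n))          # the transposition (0,1)
    y = [(i + 1) % n for i in range(n)]     # the n-cycle (0,1,...,n-1)
    gens = [induced_on_edge_sets(x, n), induced_on_edge_sets(y, n)]
    return gens, orbits(gens, 2 ** (n * (n - 1) // 2))

gens3, orbits3 = graph_orbits(3)
print(gens3[0])                              # [0, 2, 1, 3, 4, 6, 5, 7]
print(gens3[1])                              # [0, 2, 4, 6, 1, 3, 5, 7]
print([(o[0], len(o)) for o in orbits3])     # [(0, 1), (1, 3), (3, 3), (7, 1)]
print(len(graph_orbits(4)[1]))               # 11
```

The first three printed lines reproduce XB, YB, and the representatives and orbit lengths found above; the last counts the orbits for $n = 4$.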

This same calculation can be repeated with N=4 and with a mod- 
erately large workspace for N=5 as well. This and other similar investiga- 
tions are left as exercises. Exercise 10.23 describes a method for computing 
the number of nonisomorphic graphs with n vertices when n > 5. 


EXERCISES 


1 Use the method presented in the text to determine up to isomor- 
phism the undirected graphs with four vertices. Draw a diagram for 
one graph of each isomorphism type. 

2  Do Exercise 1 for the graphs with five vertices.


3  Let $n$, $X$, $Y$, $Z$, and $H$ be as in the discussion preceding Theorem 2. Let $E$ be in an orbit of $H$ on $Z$ that contains $m$ elements. Show that the order of the automorphism group of the graph $(X, E)$ is $n!/m$. Determine the orders of the automorphism groups of the graphs diagrammed as part of Exercises 1 and 2.


*4  Find generating elements for Aut($G$) for the representative graphs $G$ obtained in Exercises 1 and 2.


5 Show how the methods of this section can be modified to con- 
struct the nonisomorphic graphs with a specified number of ver- 
tices and a specified number of edges. 


6 Suggest ways of reducing the amount of work required to de- 
termine the graphs with six vertices. For example, describe the 
graphs with six vertices in which each vertex is connected to at 
most two other vertices. 


7  Let $U$ be the set of all 3-element subsets of ⍳6 and let $V$ be the set of 3-element subsets of $U$. We have homomorphisms $\Sigma_6 \to \Sigma(U)$ and $\Sigma(U) \to \Sigma(V)$. Let $H$ be the image of $\Sigma_6$ under the composition of these maps. Determine representatives for the orbits of $H$.


10. CONJUGACY 


The relations of right and left congruence modulo a subgroup are 
examples of equivalence relations defined on the elements of a group. We 
will now describe another important equivalence relation associated with 
any group. 

Let $G$ be a group and let $x$ and $y$ be elements of $G$. We say that $x$ and $y$ are conjugate in $G$ if there is an element $z$ of $G$ such that $y = z^{-1}xz$ or, equivalently, such that $xz = zy$.


THEOREM 1. In any group $G$ conjugacy is an equivalence relation.


Proof. For any $x$ in $G$ we have $x = 1^{-1}x1$, and so conjugacy is reflexive. If $y = z^{-1}xz$, then $x = zyz^{-1} = (z^{-1})^{-1}yz^{-1}$, so we also have symmetry. Finally, if $y = u^{-1}xu$ and $z = v^{-1}yv$, then

$$z = v^{-1}u^{-1}xuv = (uv)^{-1}x(uv),$$

proving that conjugacy is transitive. □


The equivalence classes of the conjugacy relation are called conjugacy classes. If $x$ and $z$ are in $G$, then $y = z^{-1}xz$ is called the conjugate of $x$ by $z$. Note that $y = x$ if and only if $x$ and $z$ commute.

Suppose G is an N-by-N group table. Then two integers I and J are conjugate in (⍳N, G) if and only if G[I;K]=G[K;J] for some K in ⍳N. This corresponds to the proposition ∨/G[I;]=G[;J], which can also be written as G[I;]∨.=G[;J]. Thus the characteristic matrix for conjugacy in (⍳N, G) is G∨.=G.

      E24←G24∨.=G24
      SEQREL E24
1

Here we have verified that conjugacy is an equivalence relation in (⍳24, G24).
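The table G24 is not listed here, so the following Python sketch applies the same idea to a table built for $\Sigma_3$. The names `multiply`, `E`, and `is_equivalence` are illustrative assumptions; the entry `E[i][j]` is 1 exactly when `G[i][k] == G[k][j]` for some `k`, which is the condition $xz = zy$.

```python
from itertools import permutations

elements = list(permutations(range(3)))        # the six elements of Sigma_3
number = {p: i for i, p in enumerate(elements)}

def multiply(f, g):
    """Product fg with maps applied left to right: x(fg) = (xf)g."""
    return tuple(g[f[x]] for x in range(len(f)))

n = len(elements)
# group table: G[i][j] is the number of the product of elements i and j
G = [[number[multiply(f, g)] for g in elements] for f in elements]

# characteristic matrix for conjugacy
E = [[int(any(G[i][k] == G[k][j] for k in range(n))) for j in range(n)]
     for i in range(n)]

def is_equivalence(rel):
    size = range(len(rel))
    reflexive = all(rel[i][i] for i in size)
    symmetric = all(rel[i][j] == rel[j][i] for i in size for j in size)
    transitive = all(rel[i][k] for i in size for j in size for k in size
                     if rel[i][j] and rel[j][k])
    return reflexive and symmetric and transitive

classes = {tuple(row) for row in E}            # distinct rows = conjugacy classes
print(is_equivalence(E), len(classes))         # True 3
```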

There is an important connection between the conjugacy classes of a 
group G and the isomorphisms of G onto itself. Such an isomorphism is 
called an automorphism of G, and the set of all automorphisms of G is 
denoted Aut(G). 


THEOREM 2. For any group $G$ the set Aut($G$) is a subgroup of $\Sigma(G)$.


Proof. By the definition of isomorphism, any automorphism of $G$ is a permutation of $G$ and thus is an element of $\Sigma(G)$. The identity map on $G$ is an automorphism of $G$, so Aut($G$) is nonempty. Finally, by Theorem 4.3, we know that Aut($G$) is closed under composition and inverses, so Aut($G$) is therefore a subgroup of $\Sigma(G)$. □


If $z$ is an element of the group $G$, we will denote by $\tau_z$ the map of $G$ into itself taking $x$ in $G$ to $z^{-1}xz$. Thus $x$ and $y$ are conjugate if and only if $y = x\tau_z$ for some $z$ in $G$.


THEOREM 3. For each $z$ in $G$ the map $\tau_z$ is an automorphism of $G$. Moreover, $\tau$ is a homomorphism of $G$ into Aut($G$).


Proof. We must first show that $\tau_z$ is in Aut($G$). Suppose $x$ and $y$ are in $G$. Then

$$(xy)\tau_z = z^{-1}xyz = z^{-1}xzz^{-1}yz = (x\tau_z)(y\tau_z)$$

and hence $\tau_z$ is a homomorphism of $G$ into itself. If $1 = z^{-1}xz = x\tau_z$, then $x = zz^{-1} = 1$ and $\tau_z$ is therefore injective. Finally, for any $x$ in $G$, we have $x = z^{-1}zxz^{-1}z = (zxz^{-1})\tau_z$. Thus $\tau_z$ is surjective and $\tau_z$ is in Aut($G$). Now, for any $x$, $z$, and $w$ in $G$, we have

$$x(\tau_z\tau_w) = (x\tau_z)\tau_w = w^{-1}(z^{-1}xz)w = (zw)^{-1}x(zw) = x\tau_{zw}.$$

Therefore $\tau_z\tau_w = \tau_{zw}$, and $\tau$ is a homomorphism of $G$ into Aut($G$). □


The kernel of $\tau$ is called the center of $G$ and is denoted $Z(G)$. Clearly, $Z(G)$ is the set of elements in $G$ that commute with every element of $G$. The image in Aut($G$) of $G$ under $\tau$ is called the group of inner automorphisms of $G$ and is written Inn($G$). By the First Isomorphism Theorem, Inn($G$) is isomorphic to $G/Z(G)$. It is not hard to show that Inn($G$) is a normal subgroup of Aut($G$). (See Exercise 33.) The quotient group Aut($G$)/Inn($G$) is called the outer automorphism group of $G$. The term “outer automorphism” has two common meanings and may refer either to an element of Aut($G$)/Inn($G$) or to an element of Aut($G$) − Inn($G$).

Two elements $x$ and $y$ of a group $G$ are conjugate in $G$ if and only if they are in the same orbit of Inn($G$). We define the centralizer $C_G(x)$ of $x$ in $G$ to be the set of elements in $G$ that commute with $x$. It is easily shown that $C_G(x)$ is a subgroup of $G$. (See Exercise 3.8.)


THEOREM 4. If $x$ is an element of the finite group $G$, then the number of conjugates of $x$ in $G$ is $|G : C_G(x)|$.


Proof. The number of conjugates of $x$ is the number of elements in the orbit of Inn($G$) $= G\tau$ that contains $x$. Since $C_G(x) = \{z \in G \mid x = x\tau_z\}$, the theorem follows from Theorem 8.8. □
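Theorem 4 is easy to confirm by brute force in a small group. The sketch below, in Python with hypothetical helper names `multiply` and `inverse`, checks the statement for every element of $\Sigma_4$ and collects the class sizes. Permutations are tuples of images and products are taken left to right, as in the text.

```python
from itertools import permutations

def multiply(f, g):
    """Product fg, applying f first: x(fg) = (xf)g."""
    return tuple(g[f[x]] for x in range(len(f)))

def inverse(f):
    inv = [0] * len(f)
    for x, image in enumerate(f):
        inv[image] = x
    return tuple(inv)

group = list(permutations(range(4)))           # Sigma_4, of order 24
class_sizes, seen = [], set()
for x in group:
    conjugates = {multiply(multiply(inverse(z), x), z) for z in group}
    centralizer = [z for z in group if multiply(x, z) == multiply(z, x)]
    # Theorem 4: the number of conjugates is the index of the centralizer
    assert len(conjugates) == len(group) // len(centralizer)
    if x not in seen:
        seen |= conjugates
        class_sizes.append(len(conjugates))

print(sorted(class_sizes))                     # [1, 3, 6, 6, 8]
```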


Let us turn now to the problem of determining the conjugacy classes of $\Sigma(X)$, where $X$ is a finite set.


LEMMA 5. Let $g$ and $h$ be elements of $\Sigma(X)$ and suppose $(x_1, \ldots, x_r)$ is a cycle of $h$. Then $(x_1g, \ldots, x_rg)$ is a cycle of $g^{-1}hg$.


Proof. This is equivalent to saying that if $h$ takes $x$ to $y$, then $g^{-1}hg$ takes $xg$ to $yg$. But, if $xh = y$, then

$$(xg)(g^{-1}hg) = x(gg^{-1}hg) = x(hg) = (xh)g = yg. \qquad □$$

THEOREM 6. Let $X$ be a finite set. Two elements of $\Sigma(X)$ are conjugate if and only if they have the same number of cycles of each length.


Proof. Let $g$ and $h$ be in $\Sigma(X)$. If the cycles of $h$ are

$$(x_1, \ldots, x_r)(y_1, \ldots, y_s)(z_1, \ldots, z_t)\cdots,$$

then, by Lemma 5, the cycles of $g^{-1}hg$ are

$$(x_1g, \ldots, x_rg)(y_1g, \ldots, y_sg)(z_1g, \ldots, z_tg)\cdots,$$

so $h$ and $g^{-1}hg$ have the same number of cycles of each length. Now suppose $h'$ is an element of $\Sigma(X)$ with cycles

$$(x_1', \ldots, x_r')(y_1', \ldots, y_s')(z_1', \ldots, z_t')\cdots.$$

Then the symbol

$$\begin{pmatrix} x_1 & \cdots & x_r & y_1 & \cdots & y_s & z_1 & \cdots & z_t & \cdots \\ x_1' & \cdots & x_r' & y_1' & \cdots & y_s' & z_1' & \cdots & z_t' & \cdots \end{pmatrix}$$

represents an element $g$ of $\Sigma(X)$ such that $h' = g^{-1}hg$. □
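The two-row symbol in the proof of Theorem 6 is constructive, and the construction can be sketched in Python. The function names `cycles` and `conjugating_element` are illustrative assumptions; cycles of equal length are paired in the order in which they are found, which is one of several valid choices.

```python
def multiply(f, g):
    """Product fg, applying f first: x(fg) = (xf)g."""
    return tuple(g[f[x]] for x in range(len(f)))

def inverse(f):
    inv = [0] * len(f)
    for x, image in enumerate(f):
        inv[image] = x
    return tuple(inv)

def cycles(p):
    """Cycles of p (fixed points included), each starting at its smallest point."""
    seen, result = set(), []
    for start in range(len(p)):
        if start not in seen:
            cycle, x = [], start
            while x not in seen:
                seen.add(x)
                cycle.append(x)
                x = p[x]
            result.append(cycle)
    return result

def conjugating_element(h, h_prime):
    """Return g with g^-1 h g = h_prime, or None if the cycle lengths differ."""
    ch = sorted(cycles(h), key=len)
    chp = sorted(cycles(h_prime), key=len)
    if [len(c) for c in ch] != [len(c) for c in chp]:
        return None
    g = [None] * len(h)
    for c, cp in zip(ch, chp):
        for x, xp in zip(c, cp):
            g[x] = xp                  # top row x, bottom row x'
    return tuple(g)

h = (1, 2, 0, 4, 3)                    # cycles (0,1,2)(3,4)
h_prime = (3, 2, 1, 4, 0)              # cycles (0,3,4)(1,2)
g = conjugating_element(h, h_prime)
print(g)                                                 # (0, 3, 4, 1, 2)
print(multiply(multiply(inverse(g), h), g) == h_prime)   # True
print(conjugating_element(h, (1, 2, 3, 4, 0)))           # None
```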




A partition of the positive integer $n$ is a sequence $m_1, \ldots, m_s$ of positive integers such that $m_1 \le m_2 \le \cdots \le m_s$ and $n = m_1 + m_2 + \cdots + m_s$.


COROLLARY 7. The number of conjugacy classes of $\Sigma_n$ is equal to the number of partitions of $n$.


Proof. The lengths $m_1, \ldots, m_s$ of the cycles of an element of $\Sigma_n$ form a partition of $n$ (when arranged in increasing order) and, by Theorem 6, two elements of $\Sigma_n$ are conjugate in $\Sigma_n$ if and only if the corresponding partitions of $n$ are the same. □
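Corollary 7 can be illustrated for small $n$ by counting cycle types directly and comparing with an independent count of partitions. In this Python sketch the names `cycle_type` and `count_partitions` are assumptions made for the example.

```python
from itertools import permutations

def cycle_type(p):
    """Sorted tuple of the cycle lengths of p, fixed points included."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start not in seen:
            length, x = 0, start
            while x not in seen:
                seen.add(x)
                x = p[x]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths))

def count_partitions(n, largest=None):
    """Number of partitions of n into parts of size at most `largest`."""
    if largest is None:
        largest = n
    if n == 0:
        return 1
    return sum(count_partitions(n - part, part)
               for part in range(1, min(n, largest) + 1))

counts = []
for n in range(1, 7):
    types = {cycle_type(p) for p in permutations(range(n))}
    counts.append((n, len(types), count_partitions(n)))
print(counts)
# [(1, 1, 1), (2, 2, 2), (3, 3, 3), (4, 5, 5), (5, 7, 7), (6, 11, 11)]
```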


We can now state a simple condition for a permutation in $\Sigma_n$ to be even.


THEOREM 8. An element $g$ of $\Sigma_n$, $n \ge 2$, is even if and only if $g$ has an even number of cycles of even length.


Proof. Let $U$ be the group $(\{1, -1\}, \times)$ and let $\operatorname{sgn}: \Sigma_n \to U$ be the surjective homomorphism defined in Section 8. If $f$ and $g$ are in $\Sigma_n$, then

$$\operatorname{sgn}(f^{-1}gf) = \operatorname{sgn}(f)^{-1}\operatorname{sgn}(g)\operatorname{sgn}(f) = \operatorname{sgn}(g),$$

since $U$ is abelian. Thus $\operatorname{sgn}(g)$ depends only on the conjugacy class of $g$ and hence only on the cycle structure of $g$. As we saw earlier, the transposition $(0,1)$ is odd and, therefore, so is every transposition. Since the transpositions generate $\Sigma_n$, an element of $\Sigma_n$ is odd or even according to whether it can be written as a product of an odd or an even number of transpositions. Now

$$(x_1, \ldots, x_r) = (x_1, x_2)(x_1, x_3)\cdots(x_1, x_r).$$

Therefore an $r$-cycle is the product of $r - 1$ transpositions, so cycles of even length are odd permutations and cycles of odd length are even permutations. Thus an element of $\Sigma_n$ is even if and only if it has an even number of cycles of even length. □
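As a check on Theorem 8, the following Python sketch compares the cycle criterion with the parity of the number of inversions, a standard description of the sign that is assumed here to agree with the homomorphism sgn of Section 8. The function names are hypothetical.

```python
from itertools import permutations

def even_length_cycles(p):
    """Number of cycles of p having even length."""
    seen, count = set(), 0
    for start in range(len(p)):
        if start not in seen:
            length, x = 0, start
            while x not in seen:
                seen.add(x)
                x = p[x]
                length += 1
            if length % 2 == 0:
                count += 1
    return count

def inversions(p):
    """Number of pairs i < j with p[i] > p[j]."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])

agree = all((even_length_cycles(p) % 2 == 0) == (inversions(p) % 2 == 0)
            for n in range(2, 7) for p in permutations(range(n)))
print(agree)   # True
```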


Suppose $N$ is a normal subgroup of a group $G$. If $x$ is in $N$ and $g$ is in $G$, then the conjugate $g^{-1}xg$ of $x$ is in $g^{-1}Ng = N$. Thus $N$ is a union of conjugacy classes. We say $G$ is simple if $G \ne \{1\}$ and the only normal subgroups of $G$ are $G$ and $\{1\}$. It is not hard to show (see Exercise 1) that $G$ is simple if and only if $G$ is nontrivial and $G = \langle C \rangle$ for every conjugacy class $C \ne \{1\}$. Since the kernel of a homomorphism is a normal subgroup, a homomorphism from a simple group into any other group is either injective or trivial.

Let G be a nontrivial finite group. Among the proper normal sub- 
groups of G, let N be one of largest order. By Theorem 5.10, the quotient 
group G/N has no proper nontrivial normal subgroups, so G/N is simple. 
If N is nontrivial, then we can find a normal subgroup M of N such that 
N/M is simple. This process can be repeated until the trivial subgroup of 
G is reached. Thus G gives rise to one or more finite simple groups from 
which G is in some sense built up. Therefore we can consider the finite 
simple groups as the basic building blocks from which all finite groups are 
constructed. For this reason the determination of the finite simple groups 
has been one of the central problems in the theory of finite groups for 
nearly 100 years. This project has recently been completed. 


Abelian simple groups are easy to describe. They are the cyclic groups of prime order. There are many infinite families of finite, nonabelian simple groups. The exercises contain an outline of a proof that the alternating groups $A_n$ with $n \ge 5$ are nonabelian simple groups. In addition, there are 26 finite, simple groups that do not fit nicely into any infinite family. The existence of many of these groups is difficult to establish, and machine computation was used to prove that some of them exist. The paper by Gorenstein in the bibliography gives an overview of the classification of the finite simple groups. In particular, Chapter II of that paper describes all of the finite, nonabelian simple groups.

So far we have talked only about conjugacy of elements in a group. We can extend the notion to subgroups quite easily. Two subgroups $H$ and $K$ of a group $G$ are conjugate if there is an element $g$ of $G$ such that $K = g^{-1}Hg$. It is not difficult to show that conjugacy is an equivalence relation on the set of subgroups of $G$ and that if $G$ is finite, then the number of conjugates of a given subgroup $H$ is the index in $G$ of the stabilizer $N_G(H) = \{g \in G \mid g^{-1}Hg = H\}$. We call $N_G(H)$ the normalizer of $H$ in $G$, since $N_G(H)$ is the largest subgroup of $G$ in which $H$ is normal. Thus $N_G(H)$ consists of the elements $g$ of $G$ for which the right coset $Hg$ is the same as the left coset $gH$. An element or a subgroup of $N_G(H)$ is said to normalize $H$.

Repeated computation of centralizers and normalizers is one method for discovering subgroups in a given group. For example, in (⍳60, G60), the element 1 has order 2.

      GTINIT G60
      GTSGP 1
0 1

The centralizer H of 1 has order 4.

      ⎕←H←(G60[1;]=G60[;1])/⍳60
0 1 56 59

To construct the normalizer K of H, we first find the characteristic matrices for right and left congruence modulo H

      R←GTRCON H
      L←GTLCON H

and then obtain K as the set of group elements I for which the right coset of H containing I is the same as the left coset of H containing I.

      ⎕←K←(∧/R=L)/⍳60
0 1 22 23 29 30 31 32 36 37 56 59
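The table G60 is not reproduced here. As a stand-in, the following Python sketch carries out the same two steps in the alternating group $A_5$, a group of order 60 chosen for illustration; whether (⍳60, G60) is numbered compatibly with it is not assumed. The names `multiply`, `is_even`, `t`, `H`, and `K` are hypothetical.

```python
from itertools import permutations

def multiply(f, g):
    """Product fg, applying f first: x(fg) = (xf)g."""
    return tuple(g[f[x]] for x in range(len(f)))

def is_even(p):
    n = len(p)
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inv % 2 == 0

A5 = [p for p in permutations(range(5)) if is_even(p)]   # order 60

t = (1, 0, 3, 2, 4)                    # the element (0,1)(2,3), of order 2
# centralizer of t: the elements commuting with t
H = [z for z in A5 if multiply(t, z) == multiply(z, t)]
# normalizer of H: the elements g whose right coset Hg equals the left coset gH
K = [g for g in A5
     if {multiply(h, g) for h in H} == {multiply(g, h) for h in H}]

print(len(H), len(K), len(A5) // len(K))   # 4 12 5
```

The last number is the index of the normalizer, which is the number of conjugates of the subgroup.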


Since K has order 12, the number of conjugates of H in (⍳60, G60) is 5. The following theorem summarizes some elementary facts concerning automorphisms.


THEOREM 9. Let $\tau$ be an automorphism of the group $G$. If $x$ is in $G$, then $x$ and $x\tau$ have the same order. If $H$ is a subgroup of $G$, then $H$ and $H\tau$ are isomorphic. If $H \trianglelefteq G$, then $H\tau \trianglelefteq G$ and $G/(H\tau)$ is isomorphic to $G/H$.

Proof. Since $\tau$ fixes 1 and $(x^n)\tau = (x\tau)^n$, it follows that $x^n = 1$ if and only if $(x\tau)^n = 1$. Thus $x$ and $x\tau$ have the same order. The restriction $\tau|_H$ of $\tau$ to $H$ is an isomorphism of $H$ onto $H\tau$. Finally, if $H \trianglelefteq G$, let $\pi: G \to G/H$ be the natural map and let $\sigma = \tau^{-1} \circ \pi$. Then $\sigma$ is a homomorphism of $G$ onto $G/H$. The element $g$ of $G$ is in the kernel of $\sigma$ if and only if $g\tau^{-1}$ is in the kernel $H$ of $\pi$ or, equivalently, $g$ is in $H\tau$. Thus the kernel of $\sigma$ is $H\tau$. Therefore $H\tau \trianglelefteq G$ and, by the First Isomorphism Theorem,

$$G/(H\tau) \cong G/H. \qquad □$$

COROLLARY 10. Conjugate elements of a group have the same order and conjugate subgroups are isomorphic.

Proof. Since conjugation by a group element is an automorphism, the corollary is an immediate consequence of Theorem 9. □

We close this section with an application of Theorem 4. If p is a prime, 
then a p-group is a finite group whose order is a power of p. 

THEOREM 11. If G is a nontrivial p-group, then Z(G) is nontrivial. 


Proof. Let $C_1, \ldots, C_r$ be the conjugacy classes of $G$ and suppose the numbering has been chosen so that $C_1 = \{1\}$. Set $c_i = |C_i|$. Then $c_1 = 1$ and $c_1 + \cdots + c_r = |G| = p^m$ for some integer $m \ge 1$. Thus

$$p^m - 1 = c_2 + \cdots + c_r.$$

By Theorem 4, each $c_i$ divides $|G|$ and thus each $c_i$ is a power of $p$. Since $m \ge 1$, $p$ does not divide $p^m - 1$. Therefore some $c_i$ with $i \ge 2$ is not divisible by $p$. However, the only power of $p$ that is not divisible by $p$ is $p^0 = 1$. Thus $c_i = 1$ and $C_i = \{x\}$, where $x \ne 1$. But this means that $C_G(x) = G$, so $x$ is in the center of $G$. Hence $Z(G) \ne 1$. □
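The counting argument in the proof can be watched at work in a small 2-group. The Python sketch below builds the dihedral group of order 8 as permutations of four points, an example chosen here and not taken from the text, and lists its class sizes. The helper names are hypothetical.

```python
def multiply(f, g):
    """Product fg, applying f first: x(fg) = (xf)g."""
    return tuple(g[f[x]] for x in range(len(f)))

def inverse(f):
    inv = [0] * len(f)
    for x, image in enumerate(f):
        inv[image] = x
    return tuple(inv)

def generate(gens, degree):
    """Subgroup generated by gens (closure under right multiplication)."""
    identity = tuple(range(degree))
    group, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = multiply(x, g)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return frozenset(group)

r = (1, 2, 3, 0)                 # the rotation (0,1,2,3)
s = (0, 3, 2, 1)                 # the reflection (1,3)
G = generate([r, s], 4)          # dihedral group of order 8 = 2**3

class_sizes, seen = [], set()
for x in sorted(G):
    if x not in seen:
        cls = {multiply(multiply(inverse(z), x), z) for z in G}
        seen |= cls
        class_sizes.append(len(cls))

center = [x for x in G if all(multiply(x, z) == multiply(z, x) for z in G)]
print(len(G), sorted(class_sizes), len(center))   # 8 [1, 1, 2, 2, 2] 2
```

Every class size is a power of 2, and the classes of size 1 are exactly the elements of the center.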


EXERCISES 
1  Show that a group $G$ is simple if and only if $G \ne \{1\}$ and every conjugacy class other than $\{1\}$ generates $G$.

2  Prove that the center of $\Sigma_n$ is $\{1\}$ for $n > 2$.

3  Let $H$ be a subgroup of $G$ and suppose $C = C_G(H)$ and $N = N_G(H)$. Prove that $N/C$ is isomorphic to a subgroup of Aut($H$). (See Exercise 5.5.)

*4  Suppose $G$ is a finite group with more than two elements. Show that Aut($G$) $\ne 1$.

5  Determine the conjugacy classes in $\Sigma_n$, $n \le 7$.

*6  Let $G$ be a group and let $R: G \to \Sigma(G)$ be the right regular representation of $G$. Let $N$ be the normalizer in $\Sigma(G)$ of the image of $G$ under $R$. Show that the stabilizer $N_1$ in $N$ of the identity element 1 of $G$ is Aut($G$). (The group $N$ is often called the holomorph of $G$.)


7  Determine the automorphism groups of the groups $Z_3$, $Z_2 \times Z_2$, Z,, and Z,.


8  Show that every automorphism of $\Sigma_3$ is an inner automorphism.

9  Prove that Aut($Z_n$) is isomorphic to $U_n$.
10  Let $g$ be an element of $\Sigma_n$ with $k_1$ cycles of length $n_1$, $k_2$ cycles of length $n_2$, .... Show that the number of conjugates of $g$ in $\Sigma_n$ is

$$\frac{n!}{n_1^{k_1} n_2^{k_2} \cdots k_1!\, k_2! \cdots}.$$

11  Let $x$ be an $n$-cycle in $\Sigma_n$. Prove that the centralizer of $x$ in $\Sigma_n$ is $\langle x \rangle$.

12  Is it possible to find elements $x$ and $y$ in $A_n$ such that $x$ and $y$ are conjugate in $\Sigma_n$ but are not conjugate in $A_n$?

13  Suppose $G$ is a group and $G/Z(G)$ is cyclic. Show that $G$ is abelian.

14  Prove that if $p$ is a prime, then any group of order $p^2$ is abelian.

15  Fill in the details of the proof of Theorem 9.

16  Let $N$ be a normal subgroup of the group $G$. By Theorem 5.10, there is a 1-1 correspondence between the subgroups of $G$ containing $N$ and the subgroups of $G/N$. Suppose a subgroup $K$ of $G$ corresponds to a subgroup $L$ of $G/N$. Show that $N_G(K)$ corresponds to $N_{G/N}(L)$.

17  Let $H$ be a proper subgroup of the $p$-group $G$. Prove that $N_G(H)$ contains $H$ properly. [Hint. Consider the two cases $Z(G)$ contained in $H$ and $Z(G)$ not contained in $H$.]


18  Let $H$ and $K$ be groups and let $\sigma: H \to$ Aut($K$) be a homomorphism. Denote the image of $h$ under $\sigma$ by $\sigma_h$. Show that the set $H \times K$ with the product $(h_1, k_1)(h_2, k_2) = (h_1h_2, (k_1\sigma_{h_2})k_2)$ is a group. (Such a group is called a semidirect product of $H$ and $K$.)

19  Determine all possible semidirect products of $H$ and $K$ where

(a) $H = Z_3$, $K = Z_2 \times Z_2$.

(b) $H = Z_2 \times Z_2$, $K = Z_3$.

(c) $H = Z_4$, $K = Z_3$.

(See Exercise 7.)

20  How many nonisomorphic groups of order 12 arise in Exercise 19?

21  Using the matrices G24∨.=G24 and G60∨.=G60 and the techniques of Exercise 3.35, determine sets of representatives for the conjugacy classes in the groups (⍳24, G24) and (⍳60, G60). How many elements are in each class? Show that (⍳60, G60) is a simple group.

22  Let $X$ and $G$ be as in Exercise 8.23 and let $\chi$ be the permutation character of $G$. Show that if $g$ and $h$ are conjugate elements in $G$, then $\chi(g) = \chi(h)$. Suppose $g_1, \ldots, g_r$ are representatives for the conjugacy classes of $G$ and $c_i = |C_G(g_i)|$. Prove that the number of orbits of $G$ is

$$\sum_{i=1}^{r} \frac{\chi(g_i)}{c_i}.$$

23  Let $X = \{0, \ldots, n-1\}$, let $Y$ be the set of two-element subsets of $X$, and let $Z$ be the set of all subsets of $Y$. For $g$ in $\Sigma_n$, let $g'$ be the element of $\Sigma(Y)$ induced by $g$ and let $\bar g$ be the element of $\Sigma(Z)$ induced by $g$. For $n \le 7$, choose representatives $g_1, \ldots, g_r$ for the conjugacy classes in $\Sigma_n$. (See Exercise 5.) Determine the cycle structure of $g_i'$ and, from this, the value of the permutation character $\chi(\bar g_i)$. Using Theorem 9.2 and Exercise 22, compute the number of nonisomorphic graphs with $n$ vertices.


The next four exercises outline a proof that $A_n$ is simple for $n \ge 3$, $n \ne 4$.

24  Show that $A_n$ is generated by 3-cycles. (It suffices to show that a product of any two transpositions can be written as a product of 3-cycles. Why?)

25  Prove that if $N$ is a normal subgroup of $A_n$ and $N$ contains a 3-cycle, then $N = A_n$.

26  Show that $A_3 \cong Z_3$ and that $A_4$ is not simple.

27  Assume that $n \ge 5$ and that $N$ is a nontrivial normal subgroup of $A_n$. Let $g$ be a nonidentity element in $N$. By raising $g$ to a power if necessary, we may assume $g$ has prime order $p$ so that all cycles of $g$ have length $p$ or 1. Prove:

(a) If $g$ fixes two points $x$ and $y$ and moves a third point $z$, then $u^{-1}g^{-1}ug$ is a 3-cycle in $N$, where $u = (x, y, z)$.

(b) If $p \ge 5$ and $(x_1, x_2, \ldots, x_p)$ is a cycle of $g$, then $g^{-1}u^{-1}gu$ is a 3-cycle in $N$, where $u = (x_1, x_2, x_3)$.

(c) If $p = 3$ and case (a) does not apply, then $g$ has at least two 3-cycles $(x_1, x_2, x_3)$ and $(y_1, y_2, y_3)$ and $g^{-1}u^{-1}gu$ is a 5-cycle in $N$, where $u = (x_1, x_2, y_1)$.

(d) If $p = 2$, then $g$ has at least two 2-cycles and either a fixed point or two more 2-cycles. If $g = (x)(y_1, y_2)(z_1, z_2)\cdots$, then $g^{-1}u^{-1}gu$ is a 5-cycle in $N$, where $u = (x, y_1, z_1)$. If $g = (x_1, x_2)(y_1, y_2)(z_1, z_2)\cdots$, then $u^{-1}g^{-1}ug$ is a nonidentity element of $N$ fixing at least two points, where $u = (x_1, x_2, y_1)$.

Now conclude that $N$ must contain a 3-cycle and $N = A_n$, by Exercise 25.


28  Let G be an N-by-N group table. Write an APL expression for an N-by-N matrix C such that C[I;J] is the conjugate of I by J in (⍳N, G). Assume that the vector INV lists the inverses of the group elements.

29  Show that an abelian group is simple if and only if it is cyclic of prime order.

*30  Prove that (⍳60, G60) has an outer automorphism.

31  Let $G$ be a permutation group on the set $X$ and let $x$ and $y$ be two elements of $X$ in the same orbit. Show that the stabilizers $G_x$ and $G_y$ are conjugate in $G$.

32  Let $H$ and $K$ be conjugate subgroups of the group $G$. Show that $C_G(H)$ is conjugate to $C_G(K)$ and $N_G(H)$ is conjugate to $N_G(K)$.

33  Show that Inn($G$) $\trianglelefteq$ Aut($G$) for any group $G$.

34  Determine all subgroups of $\Sigma_4$. Into how many conjugacy classes do they fall?


11. THE SYLOW THEOREMS 


Lagrange’s Theorem (Theorem 3.9) states that if G is a finite group of order 
n and m is the order of some subgroup of G, then m divides n. The converse 
of this result is false. It is not generally true that if m is a positive divisor 
of n, then G has a subgroup of order m. However, the theorems that will be 
proved in this section show that if we assume in addition that m is a power 




of a prime p, then G does, in fact, have a subgroup of order m. Moreover, 
if m is the largest power of p dividing n, then all subgroups of G of order 
m are conjugate in G. A p-subgroup of G is a subgroup whose order is a 
power of p. Our existence theorem on p-subgroups depends on the fol- 
lowing number-theoretic lemma. 


LEMMA 1. Let $n$ be a positive integer and let $p$ be a prime. If $p^a$ divides $n$, then the largest power of $p$ dividing the binomial coefficient $\binom{n}{p^a}$ is the same as the highest power of $p$ dividing $n/p^a$.

Proof. Since

$$\binom{n}{p^a} = \frac{n}{p^a} \cdot \frac{n-1}{p^a-1} \cdot \frac{n-2}{p^a-2} \cdots \frac{n-p^a+1}{1},$$

it suffices to prove that for any integer $i$, $1 \le i < p^a$, the power of $p$ in $n - i$ is the same as the power of $p$ in $p^a - i$. Suppose $p^b$ divides $n - i$. If $b \ge a$, then $p^a$ divides $n - i$ and, since $p^a$ divides $n$, this means that $p^a$ divides $i$, contradicting our assumption that $1 \le i < p^a$. Thus $b < a$. Hence $p^b$ divides $n$ and therefore divides $i$. Therefore $p^b$ divides $p^a - i$. Essentially the same argument shows that if $p^b$ divides $p^a - i$, then $p^b$ divides $i$ and $p^a$ and so divides $n - i$. □
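Lemma 1 lends itself to a quick numerical check. In this Python sketch the function name `valuation` is an assumption; it returns the exponent of the largest power of $p$ dividing its argument, and the loop compares the two exponents named in the lemma for a range of $n = k p^a$.

```python
from math import comb

def valuation(m, p):
    """Exponent of the largest power of p dividing the positive integer m."""
    e = 0
    while m % p == 0:
        m //= p
        e += 1
    return e

checked = 0
for p in (2, 3, 5):
    for a in range(0, 4):
        for k in range(1, 40):
            n = k * p ** a                       # p**a divides n, and n / p**a = k
            assert valuation(comb(n, p ** a), p) == valuation(k, p)
            checked += 1
print(checked)   # 468
```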

THEOREM 2. Let $G$ be a group of order $n$ and let $p$ be a prime. If $p^a$ divides $n$, then $G$ has a subgroup of order $p^a$.

Proof. Let $P$ denote the set of subsets of $G$ with $p^a$ elements. If $g$ is in $G$ and $A$ is in $P$, then $Ag$ is in $P$. We can define a homomorphism $\sigma: G \to \Sigma(P)$ by $A(g\sigma) = Ag$. The number of elements in $P$ is $\binom{n}{p^a}$. If $p^b$ is the highest power of $p$ in $n$, then by Lemma 1 we know that $p^{b-a}$ is the highest power of $p$ in $|P|$. Now $P$ is the union of the orbits of $G\sigma$, so there must be an orbit $O$ such that $|O|$ is not divisible by $p^{b-a+1}$. Let $A$ be an element of $O$ and let $H$ be the stabilizer of $A$ in $G$; that is,

$$H = \{g \in G \mid Ag = A\}.$$

Then $|G| = |O|\,|H|$ and $p^a$ divides $|H|$. Hence $|H| \ge p^a$. Fix $x$ in $A$. If $g$ is in $H$, then $xg$ is in $A$ and $g$ is in $x^{-1}A$. Thus $|H| \le |x^{-1}A| = p^a$. Therefore $|H| = p^a$, and we have found a subgroup of $G$ of order $p^a$. □

A Sylow $p$-subgroup of a finite group $G$ is a $p$-subgroup of $G$ whose order is the largest power of $p$ dividing $|G|$. [Ludwig Sylow (1832-1918) was a Norwegian mathematician.]


COROLLARY 3 (First Sylow Theorem). If $G$ is a finite group and $p$ is a prime, then $G$ possesses a Sylow $p$-subgroup. □




In order to prove the remaining Sylow theorems (Theorems 6 and 7 
that follow), we will need two lemmas. If H and K are subgroups of a group 
G, then the set HK may or may not be a subgroup of G. The following 
lemmas give a formula for |HK| and state a sufficient condition for HK to 
be a group. 

LEMMA 4. Let $H$ and $K$ be subgroups of the finite group $G$. Then

$$|HK| = \frac{|H| \times |K|}{|H \cap K|}.$$

Proof. The set $HK$ is a union of right cosets of $H$ of the form $Hx$ with $x$ in $K$. If $x$ and $y$ are in $K$, when does $Hx = Hy$? We know that $Hx = Hy$ if and only if $xy^{-1}$ is in $H$, which is equivalent to $xy^{-1}$ being in $H \cap K$, since $xy^{-1}$ is always in $K$. Thus $Hx = Hy$ if and only if $(H \cap K)x = (H \cap K)y$. Therefore the number of right cosets of $H$ in $HK$ is the same as the number of right cosets of $H \cap K$ in $K$, which is $|K : H \cap K|$ or $|K|/|H \cap K|$. Every right coset of $H$ has $|H|$ elements, and the formula now follows immediately. □
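The formula of Lemma 4 holds whether or not $HK$ is a subgroup, which the following Python sketch illustrates by testing every pair of cyclic subgroups of $\Sigma_4$. The helper names are hypothetical, and restricting to cyclic subgroups is a simplification made for the example.

```python
from itertools import permutations

def multiply(f, g):
    """Product fg, applying f first: x(fg) = (xf)g."""
    return tuple(g[f[x]] for x in range(len(f)))

def generate(gens, degree):
    """Subgroup generated by gens (closure under right multiplication)."""
    identity = tuple(range(degree))
    group, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = multiply(x, g)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return frozenset(group)

S4 = list(permutations(range(4)))
cyclic = {generate([a], 4) for a in S4}        # all cyclic subgroups of Sigma_4
for H in cyclic:
    for K in cyclic:
        HK = {multiply(h, k) for h in H for k in K}
        assert len(HK) * len(H & K) == len(H) * len(K)

H = generate([(1, 2, 3, 0)], 4)                # generated by the 4-cycle (0,1,2,3)
K = generate([(1, 0, 2, 3)], 4)                # generated by the transposition (0,1)
HK = {multiply(h, k) for h in H for k in K}
print(len(H), len(K), len(H & K), len(HK))     # 4 2 1 8
```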


LEMMA 5. Let $H$ and $K$ be subgroups of a group $G$ and suppose $H$ is contained in $N_G(K)$. Then $HK = KH$ and $HK$ is a subgroup of $G$.


Proof. Since $K \trianglelefteq N_G(K)$, $KH$ is a subgroup by Theorem 5.12a. If $h$ is in $H$, then $Kh = hK$, so $KH = HK$. □


THEOREM 6 (Second Sylow Theorem). The number of conjugates 
of a Sylow p-subgroup of a finite group G is congruent to 1 modulo p. 


Proof. If $p$ does not divide $|G|$, then $G$ has exactly one Sylow $p$-subgroup, the trivial subgroup. Thus we may assume $p$ divides $|G|$. Let $H$ be a Sylow $p$-subgroup of $G$ and let $S$ denote the set of conjugates of $H$ in $G$. All elements of $S$ are Sylow $p$-subgroups of $G$. Let $\tau: H \to \Sigma(S)$ be the homomorphism defined by $K(h\tau) = h^{-1}Kh$ for all $K$ in $S$ and all $h$ in $H$. By Theorem 8.8, all orbits of $H\tau$ on $S$ have lengths that are powers of $p$. Suppose $\{K\}$ is an orbit of $H\tau$ of length 1. Then, for all $h$ in $H$, we have $h^{-1}Kh = K$ and so, by Lemmas 4 and 5, the set $HK$ is a subgroup of $G$ of order $p^{2a-b}$, where $|H| = |K| = p^a$ and $|H \cap K| = p^b$. Since $|HK|$ must divide $|G|$, we have $2a - b \le a$ or $a \le b$ but, since $H \cap K$ is a subgroup of $H$, we have $b \le a$. Thus $a = b$ and $H \cap K = H$, that is, $H = K$. Thus $S$ consists of $\{H\}$ and a number of orbits of $H\tau$ all of whose lengths are divisible by $p$. Thus $|S|$ is congruent to 1 modulo $p$. □
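The Second Sylow Theorem can be observed in $\Sigma_4$. The Python sketch below finds the Sylow $p$-subgroups by brute force for $p = 2$ and $p = 3$. It assumes that each Sylow subgroup can be generated by at most two elements, which is true in $\Sigma_4$ but not in every group, and the function names are hypothetical.

```python
from itertools import permutations

def multiply(f, g):
    """Product fg, applying f first: x(fg) = (xf)g."""
    return tuple(g[f[x]] for x in range(len(f)))

def inverse(f):
    inv = [0] * len(f)
    for x, image in enumerate(f):
        inv[image] = x
    return tuple(inv)

def generate(gens, degree):
    """Subgroup generated by gens (closure under right multiplication)."""
    identity = tuple(range(degree))
    group, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = multiply(x, g)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return frozenset(group)

def sylow_subgroups(group, degree, p):
    """Subgroups of order the largest power of p dividing |group| (2-generated only)."""
    q = 1
    while len(group) % (q * p) == 0:
        q *= p
    found = set()
    for a in group:
        for b in group:
            sub = generate([a, b], degree)
            if len(sub) == q:
                found.add(sub)
    return found

S4 = list(permutations(range(4)))
results = []
for p in (2, 3):
    sylows = sylow_subgroups(S4, 4, p)
    one = next(iter(sylows))
    conjugates = {frozenset(multiply(multiply(inverse(g), h), g) for h in one)
                  for g in S4}
    results.append((p, len(sylows), len(sylows) % p, conjugates == sylows))
print(results)   # [(2, 3, 1, True), (3, 4, 1, True)]
```

In both cases the number of Sylow subgroups is congruent to 1 modulo $p$, and the conjugates of one Sylow subgroup account for all of them.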


THEOREM 7 (Third Sylow Theorem). Let H be a Sylow p-subgroup 
of the finite group G and let K be any p-subgroup of G. Then K is con- 
tained in a conjugate of H. In particular, all Sylow p-subgroups of G are 
conjugate. 




Proof. Let $S$ denote the set of conjugates of $H$ in $G$. By Theorem 6 we know that $|S| \equiv 1 \pmod p$. Now $K$ induces a group of permutations on $S$ by conjugation. All orbits of this group have lengths that are powers of $p$. Since $p$ does not divide $|S|$, there must be an orbit $\{L\}$ with one element. Then $K$ normalizes $L$ and, as in the proof of Theorem 6, the set $KL$ is a $p$-subgroup of $G$ that contains $L$. This means that $KL = L$ and $K \subseteq L$. If $K$ happens to be a Sylow $p$-subgroup of $G$, then $|K| = |L|$, so $K = L$. Thus $K$ is conjugate to $H$. □


We close this section with a proof that the congruence $(p - 1)! \equiv -1 \pmod p$ holds for any prime $p$. This result in number theory is known as Wilson’s Theorem. [John Wilson (1741-1793) was an English mathematician.]


THEOREM 8. Let $G$ be a finite abelian group written multiplicatively and let

$$z = \prod_{g \in G} g,$$

the product of the elements of $G$. Then $z = 1$ unless $G$ has exactly one element of order 2, and, in this case, $z$ is that element.


Proof. Since $G$ is abelian, the set $H$ of elements $x$ in $G$ with $x^2 = 1$ is a subgroup of $G$. By Theorem 2, the order of $H$ is a power of 2. If $y$ is in $G - H$, then $y \ne y^{-1}$, and $y$ and $y^{-1}$ occur as distinct factors in the product defining $z$ and cancel each other. Thus $z$ is the product of the elements of $H$. If $H = \{1\}$, then $z = 1$, and if $H = \{1, x\}$, then $z = x$. Thus we may assume 4 divides $|H|$. Let $x$ be any nonidentity element of $H$ and set $K = \langle x \rangle = \{1, x\}$. The product of the elements in any coset $Ky = \{y, xy\}$ of $K$ in $H$ is $x$, since $y^2 = 1$. Thus $z = x^m$, where $m = |H : K|$ is even. Therefore $z = 1$. □


COROLLARY 9 (Wilson’s Theorem). If p is a prime, then (p — 1)! = 
—l (mod p). 


Proof. If p = 2, we have the assertion 1 ≡ −1 (mod 2), which is 
certainly true. Thus we may assume p ≥ 3. The product of the elements of 
the abelian group U_p is [1][2] ... [p − 1] = [(p − 1)!]. By Theorem 
8, we need only show that [−1] is the unique element of order 2 in U_p. 
Clearly, [−1] has order 2. If [x] has order 2, then x² ≡ 1 (mod p) and 
p divides x² − 1 = (x + 1)(x − 1). Therefore p divides x + 1 or x − 1 
and hence x ≡ ±1 (mod p). Since [1] has order 1, [−1] is the only element 
of order 2 in U_p. □


The following calculation illustrates Wilson’s Theorem. 


      11|!10 
10 
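
The same check can be made outside APL. The short Python sketch below (an illustration only; `wilson_residue` is a hypothetical name) computes (p − 1)! mod p and compares it with p − 1, the standard representative of −1.

```python
from math import factorial

def wilson_residue(p):
    """(p - 1)! reduced modulo p."""
    return factorial(p - 1) % p

# The APL calculation 11|!10 above corresponds to p = 11.
print(wilson_residue(11))  # 10

# For a prime p the residue should be p - 1, that is, -1 modulo p.
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23]
print(all(wilson_residue(p) == p - 1 for p in primes))  # True
```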




EXERCISES 

1 Determine the number of Sylow p-subgroups of 2, for p = 2 
and 3. 

2 Find a Sylow 2-subgroup and a Sylow 3-subgroup in 2, and deter- 
mine the number of conjugates of each. 

3 Find one Sylow p-subgroup in (⍳60, G60) for p = 2, 3, and 5 
and determine the number of conjugates of each group. 

4 Let H and K be subgroups of a group G. Show HK is a group if 
and only if HK = KH. 

5 Let G be a group of order 15. Prove that a Sylow 3-subgroup and 
a Sylow 5-subgroup of G are normal and conclude that G ≅ Z_15. 
Give examples of other composite integers n for which every group 
of order n is cyclic. 


4 


RINGS 


In our study of groups we referred to the example (Z, +) quite often. 
Besides the binary operation +, the set of integers has another binary 
operation, multiplication. The operations + and × on Z are closely related. For 
example, for all x, y, z in Z, the familiar distributive law x × (y + z) = 
x × y + x × z holds. In this chapter we will study sets with two binary 
operations that possess many of the properties that hold for addition and 
multiplication in Z. The APL index origin will normally be 0. 


1. DEFINITION AND EXAMPLES 


A ring is a triple (R, +, ×) consisting of a set R and two binary operations 
+ and × on R satisfying the following conditions. 


1. (R, +) is an abelian group. 
2. (R, ×) is a semigroup. 
3. The distributive laws 


x × (y + z) = x × y + x × z 
(x + y) × z = x × z + y × z 

hold for all x, y, and z in R. 


We use additive notation for (R, +) and multiplicative notation for (R, ×), 
with the symbol × usually omitted. Thus we will normally write xy for 
x × y. When the operations + and × are clear from context, we will refer 
to R as the ring. However, it must be remembered that a given set may be 
the set of elements of more than one ring. 

The most familiar examples of rings are the sets Z, Q, R, and C of 
integers, rational numbers, real numbers, and complex numbers, 
respectively, with the usual operations. These rings all have a multiplicative 
identity, an element 1 such that 1x = x1 = x for all x in the ring. The set 2Z of 
even integers is a ring but has no multiplicative identity. In this book we 
will be concerned almost entirely with rings having multiplicative identities. 




We will therefore adopt the convention that the word "ring" will mean "ring 
with multiplicative identity" unless there is an explicit statement to the 
contrary. 

Although the reader should have some familiarity with complex 
numbers, it is perhaps useful to review the definition of the ring C. An element 
z of C is represented as a + bi, where a and b are real numbers and i has the 
property that i² = −1. We call a the real part and b the imaginary part 
of z. If w = c + di, then 


z + w = (a + c) + (b + d)i, 
zw = (ac − bd) + (ad + bc)i. 

Thus if z = 2 − 3i and w = 1 + 2i, then z + w = 3 − i and zw = 8 + i. 
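
As a small illustration of these two formulas, the Python sketch below represents a + bi by the pair (a, b). The names `c_add` and `c_mul` are hypothetical and are not procedures from the book.

```python
def c_add(z, w):
    """Sum of a + bi and c + di, each given as a pair (real, imaginary)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def c_mul(z, w):
    """Product of a + bi and c + di, using i*i = -1."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

z, w = (2, -3), (1, 2)   # z = 2 - 3i, w = 1 + 2i
print(c_add(z, w))       # (3, -1), that is, 3 - i
print(c_mul(z, w))       # (8, 1),  that is, 8 + i
```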
The preceding examples of rings are all infinite. Finite rings also exist. 
THEOREM 1. For any positive integer n the set Z_n is a ring. 


Proof. In Section 2.3 we defined binary operations + and × on Z_n, 
and we have already shown that (Z_n, +) is an abelian group and that 
multiplication is associative. Clearly, the congruence class [1] is a multiplicative 
identity. All that remains is to check distributivity. If x, y, z are in Z, then 


[x] × ([y] + [z]) = [x] × [y + z] = [x(y + z)] = [xy + xz] = 
[xy] + [xz] = [x] × [y] + [x] × [z]. 

The other distributive law now follows from the commutativity of 
multiplication in Z_n. Thus Z_n is a ring. □


Addition in a ring R is commutative by definition. Thus if f is a 
function from the finite set X to R, then the expression 

$$\sum_{x \in X} f(x)$$

is unambiguous, since the order in which the terms are summed does not 
matter. However, the expression 

$$\prod_{x \in X} f(x)$$

may be ambiguous and is not used unless R is known to be a commutative 
ring, one in which multiplication is commutative. We will see some 
examples of noncommutative rings shortly. 

As the next lemma shows, many elementary facts about Z and Q are 
true in any ring. However, a property such as (xy)^n = x^n y^n for any positive 
integer n does not generally hold, since it depends on the commutativity 
of multiplication. 


LEMMA 2. Let x and y be elements of a ring R. Then 

(a) 0x = x0 = 0. 
(b) x(−y) = (−x)y = −(xy). 
(c) (−x)(−y) = xy. 
(d) (−1)x = −x. 

Proof. (a) Since 0 is the identity element of (R, +), we have 0 = 0 + 0. 
Thus 
x0 = x(0 + 0) = x0 + x0. 
By the cancellation laws in (R, +) it follows that x0 = 0. Similarly, 0x = 0. 
(b) By distributivity, 
xy + x(−y) = x(y − y) = x0 = 0. 
Thus x(−y) is the additive inverse of xy, that is, −(xy). Similarly, (−x)y = 
−(xy). 
(c) By (b), 
(−x)(−y) = −(x(−y)) = −(−(xy)) = xy. 
(d) Again by (b), 
(−1)x = −(1x) = −x. □


Let us consider some more examples of rings. For the most part, 
proofs of the assertions made will be left as exercises. 


Example 1. If R = {x}, then defining x + x and x × x both to be x makes 
R into a ring in which 1 = 0 = x. We say R is trivial. The ring Z_1 of integers 
modulo 1 is an example of a trivial ring. 


Example 2. The set of polynomials a_0 + a_1X + a_2X² + ... + a_nX^n with 
integer coefficients is a ring under the usual operations of polynomial 
addition and multiplication. This ring is denoted Z[X]. Rings of polynomials 
will be studied in Section 3. 


Example 3. Let M_2(Z) denote the set of 2-by-2 matrices with integer 
entries. If A and B are in M_2(Z), then A + B is obtained by adding 
corresponding entries and AB is the usual matrix product. In APL notation the 
sum of A and B is A+B and their product is A+.×B. With these binary 
operations M_2(Z) is a ring. If we define the elements U and V of M_2(Z) by 


      ⎕←U←2 2⍴1 2 3 0              ⎕←V←2 2⍴2 1 0 2 
1 2                               2 1 
3 0                               0 2 


then, computing the products U+.×V and V+.×U, 


      U+.×V                        V+.×U 
2 5                               5 4 
6 3                               6 0 


we see that M_2(Z) is a noncommutative ring. More general matrix rings will 
be studied in Section 4. 
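
The two products can be recomputed with a few lines of Python. This is a sketch only: it assumes U has rows (1 2), (3 0) and V has rows (2 1), (0 2), and the helper `mat_mul` is a hypothetical stand-in for the APL inner product +.×.

```python
def mat_mul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

U = [[1, 2], [3, 0]]   # assumed reading of U
V = [[2, 1], [0, 2]]   # assumed reading of V

print(mat_mul(U, V))   # [[2, 5], [6, 3]]
print(mat_mul(V, U))   # [[5, 4], [6, 0]]
```

Since the two results differ, multiplication in M_2(Z) is not commutative.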


Example 4. Let C[0,1] be the set of continuous real valued functions on 
the interval [0,1]. For f and g in C[0,1], define f + g and fg by 


(f + g)(x) = f(x) + g(x), 
(fg)(x) = f(x)g(x). 


Then C[0,1] is a commutative ring. 


Example 5. Let (R, +, ×) be any ring. Define a new binary operation ∗ on 
R by x ∗ y = y × x. Then (R, +, ∗) is a ring, called the opposite ring of 
R, and is denoted R^op. 


Example 6. Small finite rings may be described by giving addition and 
multiplication tables for them. The workspace EXAMPLES contains two 
8-by-8 matrices PLUS and TIMES, which in origin 0 define binary 
operations on ⍳8. 


      PLUS                     TIMES 
0 1 2 3 4 5 6 7          0 0 0 0 0 0 0 0 
1 0 3 2 5 4 7 6          0 1 2 3 4 5 6 7 
2 3 0 1 6 7 4 5          0 2 0 2 0 2 0 2 
3 2 1 0 7 6 5 4          0 3 2 1 4 7 6 5 
4 5 6 7 0 1 2 3          0 4 2 6 4 0 6 2 
5 4 7 6 1 0 3 2          0 5 0 5 0 5 0 5 
6 7 4 5 2 3 0 1          0 6 2 4 4 2 6 0 
7 6 5 4 3 2 1 0          0 7 0 7 0 7 0 7 


The triple (⍳8, PLUS, TIMES) is a ring. 
Most of the computations necessary to check the ring axioms for 
(⍳8, PLUS, TIMES) are quite straightforward. For example, 


      ∧/,PLUS=⍉PLUS 
1 
      ∧/PLUS[0;]=⍳8 
1 


shows that addition is commutative and that 0 is an additive identity 
element and 


      ∧/,TIMES[TIMES;]=TIMES[;TIMES] 
1 
      ∧/(TIMES[1;]=⍳8),TIMES[;1]=⍳8 
1 


shows that multiplication is associative and that 1 is a multiplicative 
identity. 

The distributive laws are a little more complicated. The first 
distributive law states that x(y + z) = xy + xz. To check this in our example, we 
must verify that 


      TIMES[X;PLUS[Y;Z]] = PLUS[TIMES[X;Y];TIMES[X;Z]] 

for all X, Y, and Z in ⍳8. Now 


      TIMES[X;PLUS[Y;Z]] 

is a typical entry in the array TIMES[;PLUS] while 


      PLUS[TIMES[X;Y];TIMES[X;Z]] 


is a typical entry in the generalized transpose 


      0 1 0 2⍉PLUS[TIMES;TIMES] 

Thus the calculation 


      ∧/,TIMES[;PLUS]=0 1 0 2⍉PLUS[TIMES;TIMES] 
1 


verifies the first distributive law in (⍳8, PLUS, TIMES). The verification 
of the remaining axioms is left to the reader. 
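
The same axiom checks can be written as explicit loops. The Python sketch below is an illustration only and does not use the EXAMPLES workspace: it assumes the TIMES table reads as entered here, and it uses the observation that the PLUS table agrees entry by entry with bitwise exclusive-or on 0, ..., 7.

```python
from itertools import product

R = range(8)

# Assumed reading of the PLUS table: PLUS[x][y] is the bitwise XOR of x and y.
PLUS = [[x ^ y for y in R] for x in R]

# Assumed reading of the TIMES table, row by row.
TIMES = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 2, 3, 4, 5, 6, 7],
    [0, 2, 0, 2, 0, 2, 0, 2],
    [0, 3, 2, 1, 4, 7, 6, 5],
    [0, 4, 2, 6, 4, 0, 6, 2],
    [0, 5, 0, 5, 0, 5, 0, 5],
    [0, 6, 2, 4, 4, 2, 6, 0],
    [0, 7, 0, 7, 0, 7, 0, 7],
]

add_commutative = all(PLUS[x][y] == PLUS[y][x] for x, y in product(R, R))
zero_is_identity = all(PLUS[0][x] == x for x in R)
mul_associative = all(TIMES[TIMES[x][y]][z] == TIMES[x][TIMES[y][z]]
                      for x, y, z in product(R, repeat=3))
one_is_identity = all(TIMES[1][x] == x == TIMES[x][1] for x in R)
first_distributive = all(TIMES[x][PLUS[y][z]] == PLUS[TIMES[x][y]][TIMES[x][z]]
                         for x, y, z in product(R, repeat=3))
second_distributive = all(TIMES[PLUS[x][y]][z] == PLUS[TIMES[x][z]][TIMES[y][z]]
                          for x, y, z in product(R, repeat=3))

print(add_commutative, zero_is_identity, mul_associative,
      one_is_identity, first_distributive, second_distributive)
```

Each triple loop plays the role of one of the rank 3 array comparisons in the APL expressions above.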


Example 7. Let R_1 and R_2 be rings. We have defined what is meant by the 
direct sum R_1 ⊕ R_2 as an abelian group. If we define (x_1, y_1) × (x_2, y_2) 
to be (x_1 × x_2, y_1 × y_2), then R_1 ⊕ R_2 becomes a ring. 


Let R be a nontrivial ring, one with 1 ≠ 0. An element u of R is called 
a unit if u has a two-sided inverse v in the monoid (R, ×). By Theorem 3.1.4, 
v is unique and may be denoted u⁻¹. By the same theorem, the set U of units 
of R is a group under multiplication. The group of units of Z is {1, −1}. In 
Q every nonzero element is a unit. The group U_n, n ≥ 2, defined in Section 
3.1, is the group of units in the ring Z_n. 

A division ring is a ring with 1 ≠ 0 in which every nonzero element is 
a unit. Noncommutative division rings exist, but we will not see any until 
Section 5.4. A commutative division ring is called a field. The rings Q, 
R, and C are fields. Recall that the inverse of the nonzero complex number 
a + bi is 

$$\frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\, i.$$

If p is a prime, then by Theorem 3.1.5 every nonzero element of Z_p is a unit. 
Thus Z_p is a field. Fields make up the most important class of rings. If 
a and b are elements of a field and b ≠ 0, then ab⁻¹ is often written a ÷ b 
or a/b. 

It is possible for the product of two nonzero elements of a ring to be 
0. For example, in Z_6 the product of [2] and [3] is [6] = [0] = 0. If 
R is a commutative ring, then an element a of R is called a zero-divisor if 
a ≠ 0 and for some nonzero element b of R the product ab is 0. An integral 
domain is a commutative ring with 1 ≠ 0 that has no zero-divisors. Every 
field is an integral domain. The ring Z is an example of an integral domain 
that is not a field. 
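
For the rings Z_n these notions are easy to explore by brute force. The Python sketch below is an illustration with hypothetical function names; it lists the units and the zero-divisors of Z_n directly from the definitions.

```python
def units(n):
    """Nonzero classes a in Z_n having some b with ab = 1."""
    return [a for a in range(1, n)
            if any((a * b) % n == 1 for b in range(1, n))]

def zero_divisors(n):
    """Nonzero classes a in Z_n having some nonzero b with ab = 0."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]

print(units(6), zero_divisors(6))   # [1, 5] [2, 3, 4]
print(units(7), zero_divisors(7))   # [1, 2, 3, 4, 5, 6] []
```

Z_6 has zero-divisors and so is not an integral domain, while in Z_7 every nonzero element is a unit.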


THEOREM 3. A finite integral domain is a field. 


Proof. Let D be a finite integral domain. For any a in D with a ≠ 0 we 
must show ab = 1 for some b in D. Define a map f: D → D by f(x) = ax. 
Suppose for some x and y in D we have f(x) = f(y). Then ax = ay or a(x − y) = 
0. Since a is not a zero-divisor, we must have x − y = 0 or x = y. Thus 
f is injective. However, for maps of the finite set D into itself injectivity is 
equivalent to surjectivity. Thus f is surjective and so, in particular, there is 
an element b of D such that f(b) = ab = 1. Hence D is a field. □


It is also true that any finite division ring is a field. This fact, originally 
proved by J. H. M. Wedderburn (1882-1948), who was born in Scotland and 
worked in the United States, is much harder to prove than Theorem 3. 
An excellent discussion of Wedderburn’s Theorem can be found in Herstein. 

In the proofs of certain results about determinants that are presented 
in Section 6 we will need to expand products each factor of which is a sum. 
A simple example is (a + b)(c + d). If a, b, c, and d are all elements of a ring 
R, the distributive laws imply that (a + b)(c + d) = ac + ad + bc + bd. 
Similarly, the product 


(a_1 + a_2 + a_3)(b_1 + b_2 + b_3)(c_1 + c_2 + c_3) 


can be written as the sum of the 27 terms a_i b_j c_k, where 1 ≤ i, j, k ≤ 3. The 
following theorem describes the general situation. 


THEOREM 4. For 1 ≤ i ≤ m and 1 ≤ j ≤ n let b_ij be an element of the 
ring R and set a_i = b_i1 + ... + b_in. Then 

$$a_1 a_2 \cdots a_m = \sum_{p} b_{1,1p}\, b_{2,2p} \cdots b_{m,mp},$$

where p ranges over all n^m functions from {1, ..., m} to {1, ..., n} and ip 
denotes the image of i under p. 


Proof. The formula states that a_1 a_2 ... a_m is the sum of all possible 
products c_1 c_2 ... c_m, where c_i is a summand of a_i. A rigorous proof of 
this result can be obtained using induction on m and is left as an exercise. □
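
The expansion can be tested on a small integer example. The Python sketch below is an illustration under the assumption that R = Z; `expand_product` is a hypothetical name, and a tuple p of length m stands for a function from row indices to column indices (with indices starting at 0).

```python
from itertools import product
from math import prod

def expand_product(b):
    """Sum over all functions p of b[0][p[0]] * b[1][p[1]] * ... * b[m-1][p[m-1]]."""
    m, n = len(b), len(b[0])
    return sum(prod(b[i][p[i]] for i in range(m))
               for p in product(range(n), repeat=m))

b = [[1, 2, 3],
     [4, 5, 6]]                      # m = 2 factors, n = 3 summands in each

lhs = prod(sum(row) for row in b)    # (1 + 2 + 3)(4 + 5 + 6)
rhs = expand_product(b)              # sum of the 3**2 = 9 products
print(lhs, rhs)                      # 90 90
```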


One final remark is required concerning notation. It is possible to 
define vectors, matrices, and arrays of higher rank with entries lying in a 
fixed ring R. If A and B are arrays of this type, then −A, A + B, A − B, 
and A × B will denote the results of entry-by-entry calculations in R. In 
particular, the symbol × will never be used to denote matrix multiplication. 


EXERCISES 

1 In each of the Examples 1 to 7 show that all of the ring axioms 
are satisfied. 

2 Determine the group of units for the rings Z[X] and C[0,1]. 

3 Let R be a ring. Describe the units of the opposite ring R^op. 

4 What are the units in the ring (⍳8, PLUS, TIMES)? 

5 Let R_1 and R_2 be rings whose groups of units are U_1 and U_2, 
respectively. Show that the group of units of R_1 ⊕ R_2 is U_1 × U_2. 

6 Is the direct sum of two integral domains an integral domain? 

7 Show that Z[X] is an integral domain. 

8 Prove that Z_n is an integral domain if and only if n is a prime. 

9 The matrix 

A = [ … ] 

is a unit in M_2(Z). Find A⁻¹. 

10 Let X be a set and R be a ring. Let S be the set of all functions 
f: X → R and define operations of addition and multiplication 
on S by 

(f + g)(x) = f(x) + g(x), 
(fg)(x) = f(x)g(x). 

Show that S is a ring. 

11 Prove that the Binomial Theorem holds in any commutative ring 
R. That is, for all a, b in R and all positive integers n, 

$$(a + b)^n = \sum_{i=0}^{n} \binom{n}{i} a^i b^{n-i}.$$

12 Let X be a set and let P be the set of all subsets of X. For A, B in 
P, define 

A + B = (A − B) ∪ (B − A), 
AB = A ∩ B. 

Show that P is a ring. 

13 A Boolean ring is a ring R in which x² = x for all x in R. Show that 
every Boolean ring is commutative. The ring in Exercise 12 is a 
Boolean ring. [George Boole (1815-1864) was an English logician.] 

14 Let A be an abelian group. An endomorphism of A is a 
homomorphism from A to itself. Let End(A) be the set of all 
endomorphisms of A. If f and g are in End(A), define f + g by a(f + g) = 
af + ag and set fg = f ∘ g so that a(fg) = (af)g. Show that both f + g 
and fg are in End(A) and that End(A) is a ring. 

15 Let R be a ring and let G be a finite group. The group ring R[G] of 
G over R is the set of all functions from G to R with the following 
operations. If f and g are in R[G], then 

(f + g)(x) = f(x) + g(x), 

$$(fg)(x) = \sum_{yz = x} f(y)g(z).$$

Here the sum in the definition of fg is over all pairs (y, z) in G × G 
such that yz = x. Show that R[G] is a ring. It is standard practice 
to identify an element x of G with the characteristic function of 
the subset {x} of G, that is, the element f_x of R[G] such that 

f_x(y) = 1 if y = x, 
f_x(y) = 0 if y ≠ x. 

Prove that under this identification G is a subgroup of the group of 
units of R[G]. 

16 Let G be the group (⍳6, G6) and let S be the group ring Z[G]. 
Elements of S may be represented by integer vectors of length 6. 
Suppose A and B are elements of S. Write an APL expression for 
the product of A and B in S. What is the product of 

A+ 2°34 0 1 2 and B+ 1°034 ° 21 
in S? 

17 Show that in any ring the following generalization of the 
distributive law holds. 

$$\left(\sum_{i=1}^{m} a_i\right)\left(\sum_{j=1}^{n} b_j\right) = \sum_{i=1}^{m} \sum_{j=1}^{n} a_i b_j.$$

18 Prove Theorem 4. 




2. SUBRINGS AND HOMOMORPHISMS 


Early in our study of groups we introduced the concepts of a subgroup, a 
normal subgroup, a homomorphism, and a quotient group. In this section 
we will make the corresponding definitions for rings. 

Let (R, +, ×) be a ring. A subring of R is a subset S of R containing 
the identity element 1 of R such that S is a subgroup of (R, +) and a 
submonoid of (R, ×). Thus S is closed under sums, negatives, and products. 
For example, Z is a subring of Q, which is a subring of R, which is a subring 
of C. If a subring happens to be a field, it is often referred to as a subfield. 

In defining subgroups of a group G we did not have to make the 
explicit statement that the identity element of G must belong to every 
subgroup. This followed automatically. Since we are assuming that all of our 
rings have multiplicative identity elements, it is convenient always to have 
the identity of a subring be the same as the identity of the larger ring. The 
following example shows that we must make this a part of the definition. 
Let R = Z ⊕ Z and let S = {(x, 0) | x ∈ Z}. Then S is a subset of R that is 
closed under addition, subtraction, and multiplication and S is a ring under 
these operations. However, the identity element (1, 0) of S is not the same 
as the identity (1, 1) of R, so we do not consider S to be a subring of R. 

Here are some additional examples of subrings. 


Example 1. The set of complex numbers x + yi with x and y in Z is a sub- 
ring of C. This subring is denoted Z[i] and called the ring of Gaussian 
integers. [German mathematician Carl Friedrich Gauss (1777-1855) is 
considered to be one of the greatest mathematicians of all time. ] 


Example 2. We can generalize Example 1 slightly as follows. Let m be an 
integer and let √m denote a fixed square root of m in C. The set of 
complex numbers x + y√m with x and y in Z is a subring of C and is denoted 
Z[√m]. If m > 0, then Z[√m] is a subring of R. Normally, we assume 
that |m| > 1 and that m is square free, that is, not divisible by the square 
of an integer greater than 1. 


Example 3. In M_2(Z) the set of matrices of the form 

$$\begin{pmatrix} a & b \\ 0 & c \end{pmatrix}$$

is a subring. 


Example 4. The set of polynomials a_0 + a_1X + ... + a_nX^n in Z[X] with 
a_1 = 0 is a subring of Z[X]. 


Example 5. The set {0, 1, 2, 3} is a subring of (⍳8, PLUS, TIMES). 


      A←⍳4 
      ∧/,PLUS[A;A]∊A 
1 
      ∧/,TIMES[A;A]∊A 
1 


We defined a homomorphism of groups to be a map that is compatible 
with the group operations. For rings we require that a homomorphism be 
compatible with both addition and multiplication and also that it map the 
identity correctly. Let R and S be rings. A homomorphism from R to 
S is a map f: R → S such that 1f = 1 and for all x and y in R we have (x + y)f = 
xf + yf and (xy)f = (xf)(yf). The map taking m in Z to (m, 0) in Z ⊕ Z is 
compatible with both addition and multiplication. However, it does not map 
1 to the identity of Z ⊕ Z. The map m ↦ (m, m) is a ring homomorphism of 
Z into Z ⊕ Z. Since a ring homomorphism f is in particular an additive 
homomorphism, it follows that 0f = 0. As with groups, a bijective ring 
homomorphism is called a ring isomorphism. Two rings R and S are 
isomorphic if there is a ring isomorphism of R onto S. 

The kernel of a group homomorphism is the set of elements in the 
domain that are mapped to the identity element. Let f: R → S be a ring 
homomorphism. Since our rings have two binary operations, each with 
its own identity element, we seem to have two choices for the kernel of 
f, either I = {x ∈ R | xf = 0} or J = {x ∈ R | xf = 1}. A little investigation shows 
that both I and J are closed under multiplication, but only I is closed under 
addition. In fact, I has several other nice properties, and we define the 
kernel of f to be I. 


THEOREM 1. Let f: R → S be a homomorphism of rings and let I be 
the kernel of f. Then I is a subgroup of (R, +) and, for all x in I and r in 
R, both rx and xr are in I. 


Proof. Since f is a ring homomorphism, f is a homomorphism of the 
abelian group (R, +) into the group (S, +) and I is the kernel of this group 
homomorphism. Thus I is a subgroup of (R, +). If x ∈ I and r ∈ R, then 
(rx)f = (rf)(xf) = (rf)0 = 0. Thus rx is in I. Similarly, xr is in I. □


A subset I of a ring R that is an additive subgroup and contains all 
products rx and xr with x in I and r in R is called an ideal of R. Theorem 
1 states that the kernel of a ring homomorphism of R to S is an ideal of 
R. The sets {0} and R are always ideals of R. 

What are the ideals of Z? A subgroup of (Z, +) has the form nZ. It is 
trivial to verify that nZ is an ideal of Z for any integer n and so, for Z, the 
notions of ideal and additive subgroup coincide. 

The following theorem states analogues of parts of Theorems 3.4.1 
and 3.5.3. 




THEOREM 2. Let f: R → S be a ring homomorphism. Then 


(a) If U is a subring of R, then Uf is a subring of S. 
(b) If V is a subring of S, then Vf⁻¹ is a subring of R. 
(c) If I is an ideal of R, then If is an ideal of Rf. 
(d) If J is an ideal of S, then Jf⁻¹ is an ideal of R. 


Proof. See Exercise 11. □


Note that in Theorem 2c we can be sure If is an ideal of S only when 
f is surjective. 

THEOREM 3. Let f: R → S be a surjective ring homomorphism with 
kernel I. There is a 1-1 correspondence between the set of ideals of R 
containing I and the set of ideals of S. 


Proof. This is similar to Theorem 3.5.9 and follows easily from parts 
(c) and (d) of Theorem 2. □


Ideals seem to have a place in the theory of rings roughly equivalent 
to that of normal subgroups in the theory of groups. It is therefore natural 
to ask whether, given an ideal I of a ring R, we can define a quotient ring 
R/I. By the definition of an ideal, we know that I is a subgroup of the group 
(R, +), which is abelian. Therefore I is a normal subgroup of (R, +), and we 
can define the quotient group (R/I, +). In order to make R/I into a ring, we 
need to define multiplication. Addition in R/I is defined by (x + I) + (y + I) = 
(x + y) + I. Thus it is reasonable to try to define the product of the cosets 
x + I and y + I to be xy + I. But is this well defined? If x + I = x′ + I and 
y + I = y′ + I, then x′ = x + u and y′ = y + v with u and v in I. Hence 


x′y′ = (x + u)(y + v) = xy + uy + xv + uv 


and, since I is an ideal, uy, xv, and uv are all in I. Therefore x′y′ + I = 
xy + I and multiplication of cosets is well defined. The coset 1 + I is clearly 
a multiplicative identity. The remaining ring axioms are easily verified. For 
example, 


(x + I)[(y + I)(z + I)] = (x + I)(yz + I) = x(yz) + I 
and 
[(x + I)(y + I)](z + I) = (xy + I)(z + I) = (xy)z + I. 

Since x(yz) = (xy)z in R, multiplication in R/I is associative. The 
distributive laws are left for the reader to check. 
The following theorem is an immediate consequence of our definition 
of the ring structure of R/I. 


THEOREM 4. If I is an ideal of the ring R, then the natural map of 
R onto R/I is a ring homomorphism with kernel I. □


We have already seen one example of a quotient ring. For any 
positive integer n the ring Z_n is the quotient ring of Z modulo the ideal nZ. 

The three isomorphism theorems for groups, Theorems 3.5.7, 3.5.12, 
and 3.5.13, have natural analogues for rings. 


THEOREM 5. Let f: R → S be a surjective ring homomorphism with 
kernel I. Then S is isomorphic to R/I. 


Proof. By Theorem 3.5.7, the map h taking x + I in R/I to xf in S is 
an isomorphism of (R/I, +) onto (S, +). This map also preserves products 
and maps 1 + I in R/I to 1 in S. Thus h is a ring isomorphism. □

THEOREM 6. Let S be a subring of the ring R and let I be an ideal 
of R. Then S ∩ I is an ideal of S, S + I is a subring of R, and 


S/(S ∩ I) ≅ (S + I)/I. 


Proof. Let f: R → R/I be the natural map. Then S + I = (Sf)f⁻¹ and 
so S + I is a subring of R by Theorem 2ab. In addition, f maps both S and 
S + I onto Sf with kernels S ∩ I and I, respectively. Thus, by Theorem 5, 


S/(S ∩ I) ≅ Sf ≅ (S + I)/I. □

THEOREM 7. Let I and J be ideals of the ring R with I ⊆ J. Then 
J/I is an ideal of R/I and (R/I)/(J/I) is isomorphic to R/J. 

Proof. See the proof of Theorem 3.5.13. □


For any element a of an abelian group A, written additively, and any 
integer m we have defined an element ma in A so that the map m ↦ ma is a 
homomorphism of abelian groups. If R is a ring and we take A = (R, +) 
and a = 1, then we get even more. 


THEOREM 8. If R is a ring, then the map from Z to R taking m to 
m1 is a ring homomorphism. 


Proof. If m and n are nonnegative integers, then the formula (m1)(n1) = 
(mn)1 is proved either by a simple induction on m or by invoking Theorem 
1.4. The cases in which m or n is negative are then handled using the 
identity (−m)1 = −(m1). □


Let R be a ring and let f: Z → R be the homomorphism of Theorem 
8. The kernel of f is nZ for a unique integer n ≥ 0. We call n the 
characteristic of R. The image of Z under f is a subring of R isomorphic to Z_n. 
Since n1 = 0 in R, we have nx = (n1)x = 0 for any x in R. Of course, for 
rings of characteristic 0 this does not say anything new. 
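
For a finite ring the characteristic can be found by adding 1 to itself until 0 is reached. The Python sketch below illustrates this under stated assumptions: the ring is finite, its addition is supplied as a function, and the names `characteristic` and `add_z4_z6` are hypothetical.

```python
def characteristic(zero, one, add, order):
    """Least n >= 1 with n*1 == 0 in a finite ring having `order` elements."""
    total = zero
    for n in range(1, order + 1):
        total = add(total, one)          # total now equals n*1
        if total == zero:
            return n
    raise ValueError("n*1 never returned to 0; check the ring data")

def add_z4_z6(x, y):
    """Componentwise addition in the direct sum of Z_4 and Z_6."""
    return ((x[0] + y[0]) % 4, (x[1] + y[1]) % 6)

print(characteristic((0, 0), (1, 1), add_z4_z6, 24))       # 12
print(characteristic(0, 1, lambda x, y: (x + y) % 7, 7))   # 7
```

The search stops within `order` steps because the additive order of 1 divides the order of the group (R, +).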


THEOREM 9. Let R be an integral domain or a division ring. Then the 
characteristic of R is either 0 or a prime. 


Proof. Let n be the characteristic of R. Any commutative subring of 
R must be an integral domain. (Why?) Now R contains a subring isomorphic 
to Z/nZ, which is an integral domain if and only if n = 0 or n is a prime. □


COROLLARY 10. Let F be a finite field. Then |F| = p^m, where 
p is the characteristic of F, a prime, and m is a positive integer. 


Proof. Since F is finite, F cannot contain a subring isomorphic to 
Z. Thus, by Theorem 9, the characteristic p of F is a prime. As remarked 
before, px = 0 for all x in F. Thus every element of the finite abelian group 
(F, +) has order 1 or p. By the First Sylow Theorem, this implies that no 
prime other than p divides |F|. □


As the next result shows, fields have very few ideals. 


THEOREM 11. Let R be a commutative ring with 1 ≠ 0. Then R is a 
field if and only if the only ideals of R are {0} and R. 


Proof. Suppose first that R is a field and that I ≠ {0} is an ideal of 
R. Then I contains an element x ≠ 0. Since R is a field, x has an inverse 
in R and I contains xx⁻¹ = 1. Thus if r is an element of R, then r = r1 is 
in I. Hence I = R. 

Now suppose R has only the ideals {0} and R. Let x be a nonzero 
element of R. It is easy to show that I = Rx = {rx | r ∈ R} is an ideal of R. (See 
Exercise 9.) Since I contains x, we have I ≠ {0}. Therefore I = R and 1 is 
in I. This means that there is an element y of R such that yx = xy = 1. 
Hence x is a unit in R. Thus R is a commutative ring with 1 ≠ 0 in which 
every nonzero element is a unit. That is, R is a field. □


We close this section with a brief description of a generalization of the 
concept of an ideal. A right ideal U of a ring R is a subgroup of (R, +) such 
that for all u in U and all r in R the product ur is in U. Similarly, a left ideal 
is an additive subgroup of R that is closed under multiplication on the left 
by elements of R. Ideals, as we have defined them, are both left ideals and 
right ideals. To emphasize this fact, an ideal is often referred to as a 
two-sided ideal. In a commutative ring the concepts of a left ideal, a right ideal, 
and a two-sided ideal are the same. 


EXERCISES 


In the following exercises R, S, and T are always rings. 


1 Suppose R is a subring of S and S is a subring of 7. Show that 
R is a subring of T. 

2 Prove that the intersection of any nonempty collection of subrings 
of R is again a subring of R. 

3 Let X be a subset of R. The subring S of R generated by X is the 
intersection of all subrings of R containing X. Give another 
description of S analogous to Theorem 3.3.3. 

4 Suppose f: R → S and g: S → T are ring homomorphisms. Show 
that f ∘ g is a ring homomorphism. 

5 Let f: R → S be a bijective ring homomorphism. Prove that f⁻¹ is 
a ring homomorphism from S to R. 

6 Determine the units in Z[i]. 

7 Find a unit other than ±1 in Z[√2]. 

8 Show that nZ is an ideal of Z for any integer n. 

9 Assume that R is commutative and that x_1, ..., x_m are elements 
of R. Show that the set of elements of R of the form r_1x_1 + r_2x_2 + 
... + r_mx_m with each r_i in R is an ideal of R. 

10 How many ideals are there in the ring Z_n? 

11 Prove Theorem 2. 

12 Fill in the details of the proof of Theorem 3. 

13 Describe the ideal of Z[X] generated by 4, 2X, and X². 

14 Prove Theorem 7. 

15 An automorphism of R is an isomorphism of R with itself. Show 
that the set Aut(R) of all automorphisms of R is a subgroup of 
Σ(R). 

16 Let m be an integer that is not a square in Z. Prove that the map 
taking a + b√m to a − b√m is well defined and is an 
automorphism of Z[√m]. 

17 Show that Aut(Q) is trivial but that Aut(C) is nontrivial. 

18 Show that Aut(R) is trivial for the field R of real numbers. 

19 Let F be a field of characteristic 0. Prove that F contains a subring 
isomorphic to Q. 

20 Describe the ideal of Z[i] generated by 1 + i. 

21 Let I be a nontrivial ideal of Z[i]. Show that Z[i]/I is a finite ring. 

*22 Let z = a + bi be a nonzero Gaussian integer. Determine the order 
of Z[i]/I, where I is the ideal of Z[i] generated by z. 

23 Let P and T be N-by-N matrices such that R = (⍳N, P, T) is a 
ring. Assume that 1 is the identity of R. Suppose the vector A lists 
the elements of a subset A of R. Write APL propositions 
corresponding to the following assertions. 

(a) A is a subring of R. 

(b) A is an ideal. 

24 For each integer J in ⍳8 determine the subring of R = (⍳8, PLUS, 
TIMES) generated by J and the ideal of R generated by J. 




3. COMPUTING IN RINGS USING APL 


Already in this chapter we have introduced many rings, and we will be 
describing more examples as we go along. Frequently, we will need to compute 
with elements of these rings and to manipulate vectors and matrices with 
entries in these rings. This section is devoted to a discussion of some 
of the ways APL can be used to work with arrays that have entries in one of 
the rings Z, Z_n, Q, R, C, or Z[i], as well as arrays with entries in a small 
finite ring described by its addition and multiplication tables. As the APL 
language is currently defined, arrays in APL have real numbers as entries. 
Thus, for Z_n, C, and Z[i], we will first have to discuss methods for 
representing arrays over these rings. 

Integer arrays are the easiest to work with in APL. As long as the 
entries remain small, say less than 10^10, calculations with integer arrays are 
performed exactly on APL terminal systems. Since the entries are real 
numbers, the primitive APL operations may be used to perform most 
arithmetic computations. 

A matrix A with entries in Z_n can be represented by a matrix A of 
integers such that A[I;J] is a representative for the corresponding entry 
in A. The modulus n will always be denoted by the global variable N. A 
given matrix A can be represented by infinitely many integer matrices. 
Two integer matrices A and B represent the same matrix over Z_n if and 
only if ∧/,(N|A)=N|B or, equivalently, ∧/,0=N|A-B. If A 
represents A, then B←N|A also represents A and B is the unique matrix 
representing A that has all of its entries in the set {0, ..., n − 1}. We will call 
B the standard representation of A. All procedures in CLASSLIB with 
the prefix ZN return standard representations. 
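
The idea of a standard representation carries over directly to other languages. The Python sketch below is an illustration only; `standard_rep` and `same_over_zn` are hypothetical names, not CLASSLIB procedures, and the modulus 7 is an arbitrary choice.

```python
N = 7  # the modulus n, chosen arbitrarily for this illustration

def standard_rep(A):
    """Reduce every entry of an integer matrix into the range 0..N-1."""
    return [[a % N for a in row] for row in A]

def same_over_zn(A, B):
    """True when integer matrices A and B represent the same matrix over Z_N."""
    return standard_rep(A) == standard_rep(B)

A = [[10, -3], [7, 15]]
B = [[3, 4], [0, 1]]

print(standard_rep(A))     # [[3, 4], [0, 1]]
print(same_over_zn(A, B))  # True
```

Python's % operator, like the APL residue N|A, returns a nonnegative result for a positive modulus, so negative representatives are handled without extra work.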

If the integer arrays A and B represent arrays A and B over Z_n with 
the same shape, then the standard representations for −A, A + B, A − B, and 
A × B are N|-A, N|A+B, N|A-B, and N|A×B, respectively. For 
convenience, the procedures ZNNEG, ZNSUM, ZNDIFF, and ZNPROD have 
been included in CLASSLIB to compute these results. When N is greater 
than 10^7, then ZNPROD uses multiple-precision multiplication to 
compute A×B, since the entries in A×B may exceed the size of the largest integer 
that can be represented by the terminal system using single precision. If 
every entry in A is a unit in Z_n, then the array of inverses of the entries 
in A is represented by ZNINV A. The matrix of Mth powers of the entries 
in A is represented by A ZNPOWER M. 


      N←1000001 
      A←237938 791247 
      ⎕←B←ZNINV A 
536136 701616 
      A ZNPROD B 
1 1 
      A ZNPOWER 1000000 
405718 918192 




With the dyadic procedures ZNSUM, ZNDIFF, ZNPROD, and ZNPOWER 
arguments having one entry are expanded to match the other argument. 
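
Entry-by-entry arithmetic in Z_n is easy to imitate in Python, whose integers have arbitrary precision, so no separate multiple-precision step is needed. The sketch below is an illustration only: `zn_prod`, `zn_inv`, and `zn_power` are hypothetical analogues of ZNPROD, ZNINV, and ZNPOWER for vectors, not the CLASSLIB procedures themselves, and `pow(x, -1, N)` requires Python 3.8 or later.

```python
N = 1000001  # the modulus used in the session above

def zn_prod(a, b):
    """Entry-by-entry product of two integer vectors, reduced mod N."""
    return [(x * y) % N for x, y in zip(a, b)]

def zn_inv(a):
    """Entry-by-entry inverses mod N; every entry must be a unit in Z_N."""
    return [pow(x, -1, N) for x in a]

def zn_power(a, m):
    """Entry-by-entry m-th powers mod N."""
    return [pow(x, m, N) for x in a]

A = [237938, 791247]
B = zn_inv(A)
print(B)
print(zn_prod(A, B))   # [1, 1]
```

The built-in three-argument `pow` computes modular powers by repeated squaring, so a large exponent such as 1000000 in `zn_power(A, 1000000)` is handled quickly.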

The symbols for the APL primitive arithmetic operations indicate 
exact computations with real numbers. However, on a terminal system only 
a finite set of rational numbers can be represented and the results of the 
arithmetic operations are only approximations. For arrays whose entries 
may be irrational numbers we have little choice but to make do with the 
APL primitives and to remember that results may not be exact. Round-off 
errors often produce results that are nonzero numbers with small absolute 
value where the correct result is 0. Because of this, all procedures in 
CLASSLIB with the prefix R normally set to 0 all entries in an array that 
have absolute values less than EPSILON times the largest absolute value 
of the entries in the array, where EPSILON is a global variable normally
set to 10^-13. The following example shows how this can be done.


      A←1E4 1 1E¯4 1E¯10
      □←A←A×(|A)>EPSILON×⌈/,|A
10000 1 0.0001 0
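
The same thresholding rule can be written as a short Python sketch. The function name chop and the default tolerance are our own choices for illustration; only the rule itself (zero every entry whose absolute value does not exceed the tolerance times the largest absolute value in the array) comes from the text.

```python
def chop(a, epsilon=1e-13):
    """Replace by 0 every entry that is negligible relative to the largest entry."""
    if not a:
        return []
    cutoff = epsilon * max(abs(x) for x in a)
    # Keep an entry only when it is strictly larger than the cutoff.
    return [x if abs(x) > cutoff else 0 for x in a]
```

Applied to the list [1e4, 1, 1e-4, 1e-10], the cutoff is about 1e-9, so only the last entry is replaced by 0.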


With arrays over Q we have a choice. We can treat them simply as 
arrays over R and use the APL primitive operations. This is fast but often 
leads to incorrect results. An alternative is to use integer vectors of length 
2 to represent rational numbers. For example, if we really want to specify 
the number 1/3 and not some approximation such as 0.3333333333, then 
we can use the integer vector 1 3. In general, if P and Q are integers with
Q≠0, then the rational number P÷Q can be represented by the vector P,Q.
Addition and multiplication can be performed using the formulas 


\[ \frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}, \qquad \left(\frac{a}{b}\right)\left(\frac{c}{d}\right) = \frac{ac}{bd}, \]


which require only integer computations. 
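
As a minimal sketch of these two formulas, the following Python functions work on pairs (numerator, denominator) and use integer arithmetic only. The names q_reduce, q_sum, and q_prod are ours and are not CLASSLIB procedures; the reduction step keeps the denominator positive and the fraction in lowest terms.

```python
from math import gcd

def q_reduce(p, q):
    """Return p/q in lowest terms with a positive denominator."""
    if q == 0:
        raise ZeroDivisionError("denominator must be nonzero")
    g = gcd(p, q)          # gcd is positive because q != 0
    if q < 0:
        g = -g             # dividing by a negative g makes the denominator positive
    return (p // g, q // g)

def q_sum(x, y):
    (a, b), (c, d) = x, y
    return q_reduce(a * d + b * c, b * d)

def q_prod(x, y):
    (a, b), (c, d) = x, y
    return q_reduce(a * c, b * d)
```

For example, q_sum((7, 6), (1, 7)) adds 7/6 and 1/7 and returns the pair (55, 42).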

A matrix A of rational numbers can be represented by a rank 3 integer
array A such that A[I;J;] gives the numerator and denominator
of the corresponding entry in A. Thus

\[ \begin{bmatrix} 7/6 & -3/5 \\ 1/2 & 3 \end{bmatrix} \]

would be represented by




      □←A←2 2 2⍴7 6 ¯3 5 1 2 3 1
 7 6
¯3 5

 1 2
 3 1


The array A is unique if we assume that numerators and denominators are 
relatively prime and that denominators are positive. To improve readability, 
the procedure DAQ (for “display array of rationals’’) may be used to print 
out scalars, vectors, and matrices with rational entries. 


      DAQ A
7/6  ¯3/5
1/2   3/1


The procedures QNEG, QINV, QSUM, QDIFF, QPROD, QQUOT, and
QPOWER are the analogues of the monadic operations - and ÷ and the
dyadic operations +, -, ×, ÷, and *.


      DAQ B←2 2 2⍴1 7 2 1 8 3 1 5
1/7  2/1
8/3  1/5
      DAQ A QSUM B
55/42   7/5
 19/6  16/5
      DAQ A QPROD B
1/6  ¯6/5
4/3   3/5
      DAQ A QPOWER 2
49/36  9/25
  1/4   9/1


Note that QPROD computes the entry-by-entry product and not the matrix 
product, which is discussed in Section 5. The five dyadic procedures allow 
arguments of different shapes, provided the arrays they represent, con- 
sidered as arrays of rational numbers, are conformable for scalar arithmetic 
operations. For example, to add 1/2 to each entry in the matrix represented 
by A, we form 


      DAQ 1 2 QSUM A
5/3  ¯1/10
1/1    7/2


The fractions produced by the procedures with the prefix Q are always
reduced to lowest terms. This involves computing many greatest common
divisors. As a result, these procedures can use a significant amount of CPU
time.

To represent complex numbers in APL, we again use vectors. The
complex number 3−2i can be represented by the vector 3 ¯2 of its real
and imaginary parts. Matrices over C are represented by rank 3 APL arrays
and are displayed using DARV (for “display array of real vectors”). Thus


\[ \begin{bmatrix} 1.0 - i & 2.1 + 3.5i \\ 0.5 & 7i \end{bmatrix} \]


is represented by 


      □←U←2 2 2⍴1 ¯1 2.1 3.5 0.5 0 0 7
1    ¯1
2.1   3.5

0.5   0
0     7


which can be displayed as a matrix as follows. 


      1 DARV U
1.0  ¯1.0    2.1  3.5
 .5    .0     .0  7.0


The first argument of DARV is the number of decimal places desired. The 
procedure DAZV (for “display array of integer vectors’’) produces the 
same result as DARV with a first argument of 0 and may be used to display 
arrays of Gaussian integers. 

The procedures CINV, CSUM, CDIFF, CPROD, CQUOT, and CPOWER
perform the same operations with complex arrays that the corresponding
procedures with the prefix Q perform for arrays of rational numbers. For
example,


      1 DARV V←2 2 2⍴¯1 1 2.3 0 0 0.4 2.1 ¯3.2
¯1.0  1.0    2.3    .0
  .0   .4    2.1  ¯3.2
      1 DARV U CSUM V
 .0   .0    4.4   3.5
 .5   .4    2.1   3.8
      2 DARV U CPROD V
 .00  2.00     4.83   8.05
 .00   .20    22.40  14.70




The five dyadic procedures allow arguments of different shapes. Thus, to 
multiply all entries in the array represented by U by 2 +i, we form 


      1 DARV 2 1 CPROD U
3.0  ¯1.0     .7   9.1
1.0    .5   ¯7.0  14.0


There is no need for a procedure CNEG, since negation in C can be per- 
formed using the primitive APL negation operation. The procedures with 
the prefix C set to zero entries that are smaller in absolute value 
than EPSILON times the largest absolute value of any entry. 

The norm of a complex number z = x + yi, with x and y real, is N(z) =
x^2 + y^2, and the magnitude or absolute value of z is |z| = √(x^2 + y^2), the
nonnegative square root of N(z). The conjugate of z is z̄ = x − yi. The
procedures CNORM, CMAG, and CCONJ compute the norms, magnitudes, and
conjugates of the entries in a complex array.


      1 DARV U
1.0  ¯1.0    2.1  3.5
 .5    .0     .0  7.0
      CNORM U
2     16.66
0.25  49
      CMAG U
1.414213562 4.081666326
0.5         7
      1 DARV CCONJ U
1.0  1.0    2.1  ¯3.5
 .5   .0     .0  ¯7.0
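
The arithmetic behind these procedures can be sketched in Python with a complex number stored as a pair (real part, imaginary part), mirroring the vector representation used above. The function names are ours and are meant only as an illustration of the definitions, not as the CLASSLIB implementation.

```python
import math

def c_prod(z, w):
    """Product of x + yi and u + vi, each given as a pair."""
    (x, y), (u, v) = z, w
    return (x * u - y * v, x * v + y * u)

def c_norm(z):
    x, y = z
    return x * x + y * y

def c_mag(z):
    # Nonnegative square root of the norm.
    return math.sqrt(c_norm(z))

def c_conj(z):
    x, y = z
    return (x, -y)
```

One can check, for example, that c_prod(z, c_conj(z)) has real part c_norm(z) and imaginary part 0, which is the identity N(z) = z z̄.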


Procedures with the prefix FR perform computations with arrays
whose entries are in a specified finite ring R. A complete description of
R requires four global variables. The addition and multiplication tables
for R are assumed to be given by the matrices FRPLUS and FRTIMES.
Origin 0 is assumed, and 0 and 1 must be the additive and multiplicative
identities, respectively. The negative or inverse of I in the group (R, +)
is given by FRNEG[I]. If I is a unit in R, then FRINV[I] is the inverse
of I. If I is not a unit, then FRINV[I] must be 0. We will refer to R as
the current finite ring.

The procedure FRINIT may be used to initialize FRPLUS, FRTIMES,
FRNEG, and FRINV. The arguments are the addition and multiplication
tables for the finite ring. The procedure copies these arrays into FRPLUS
and FRTIMES and then computes FRNEG and FRINV. To work with the
ring R of order 8 discussed in Section 1, we enter




      PLUS FRINIT TIMES
      FRNEG
0 1 2 3 4 5 6 7
      FRINV
0 1 0 3 0 0 0 0


To save time, FRINIT does not actually check to see whether its
arguments really define a finite ring.
Let us consider the matrices

      □←A←2 2⍴3 2 6 1                □←B←2 2⍴7 5 0 2
3 2                            7 5
6 1                            0 2

to have entries in R. Then the sum and product of A and B are given by

      A FRSUM B                      A FRPROD B
4 7                            5 2
6 3                            0 2

The procedures FRDIFF and FRPOWER compute differences and powers.


EXERCISES 


1 Experiment with the procedures ZNNEG, ZNSUM, ZNDIFF,
ZNPROD, ZNINV, and ZNPOWER.


2 Enter rank 3 integer arrays representing the matrices 


11/3 -2 3 iA 
A= , B= ; 
1/7 9/4 13/2 0 
Use DAQ to display these arrays. Experiment with QNEG, QINV,
QSUM, QDIFF, QPROD, QQUOT, and QPOWER.
3 Enter rank 3 arrays representing the complex matrices 


3—2i 140.5 7+3i —142i 
A= , B= . 
—4 3.21 2—Si 6+ 


Experiment with DARV, DAZV, CINV, CSUM, CDIFF, CPROD, 
CQUOT,and CPOWER. 
4 Let A be an integer matrix. Write APL expressions for the rank 3 


arrays that represent the same matrix of integers considered as a 
matrix of rational numbers and as a matrix of Gaussian integers. 


5 Show for any complex numbers z and w that the following
identities hold.

(a) $N(z) = z\bar{z} = |z|^2$.        (d) $\overline{z + w} = \bar{z} + \bar{w}$.

(b) $N(zw) = N(z)N(w)$.               (e) $\overline{zw} = \bar{z}\,\bar{w}$.

(c) $|z + w| \le |z| + |w|$.

Experiment with the procedures CNORM, CMAG, and CCONJ.


Assume that FRPLUS and FRTIMES are the addition and multiplication
tables for a finite ring R with 0 and 1 as the additive and
multiplicative identities. Write APL expressions for the vector
FRNEG giving additive inverses and for the vector FRINV giving
multiplicative inverses in R.


Construct addition and multiplication tables for Z_6. Set N←6 and
set the current finite ring to Z_6 using FRINIT. Compare the
speeds of execution of the procedures with the prefixes ZN and
FR for some sample calculations with arrays over Z_6.


4. POLYNOMIAL RINGS 


In this section we will describe how, given a ring R, we can construct a new 
ring called the ring of polynomials with coefficients in R. In order to do 
this, we will need to give a formal definition of the term “polynomial’’. 
Since polynomials are introduced first in high school algebra and studied 
extensively in calculus, it may seem unnecessary to dwell very long on 
them here. The following dialogue between an instructor and an under- 
graduate student of algebra may help to explain why we are devoting a 
whole section to rings of polynomials. 


I: What is a polynomial with real coefficients?

U: Why do you ask? I’ve been working with polynomials ever since
   high school, adding them, multiplying them, and even factoring
   them sometimes. I certainly know what a polynomial is!

I: Well, if they’re that familiar, you should be able to give me a
   definition.

U: All right. A polynomial with real coefficients is an expression
   a_0 + a_1X + ... + a_nX^n, where the a_i are real numbers.

I: Thank you. Let me write down some examples to see if I
   understand your definition.

        1 + 2X,
        1 + X + X^2 + X^3,
        1 + 2X + 0X^2.


U: That’s right. But the first and third polynomials are really the same.

I: They’re different expressions. Why aren’t they different
   polynomials?

U: Well, when some a_i is 0, we usually don’t bother to write down
   the term a_iX^i.

I: Oh. Can you tell me how to add polynomials?

U: Sure. You just add corresponding coefficients. For example,
   1 + X + X^2 plus 2 + 3X + 4X^2 is 3 + 4X + 5X^2.

I: Okay. Suppose I want to add the polynomials X and X^100. What’s
   the coefficient of X^100 in X?

U: Zero.

I: Really? What’s the coefficient of X^1000 in X?

U: Zero.

I: How about X^1000000?

U: Still zero!

I: You seem to be saying a polynomial has infinitely many
   coefficients, one for each power X^i of X, i = 0, 1, 2, ....

U: Yes, I guess I am.

I: Now, can you tell me how to multiply polynomials?

U: Well, I’m not sure I can give a general formula, but I can certainly
   work out an example.

        1 +  X + 2X^2 + 4X^3
        1 − 2X + 3X^2
        -------------------------------------------
        1 +  X + 2X^2 + 4X^3
          − 2X − 2X^2 − 4X^3 − 8X^4
                 3X^2 + 3X^3 + 6X^4 + 12X^5
        -------------------------------------------
        1 −  X + 3X^2 + 3X^3 − 2X^4 + 12X^5

I: It seems to me you could save a lot of time if you didn’t bother
   to write down all those X’s and just worked with the coefficients.
   Like this:

        1   1   2   4
        1  −2   3
        ----------------------
        1   1   2   4
           −2  −2  −4  −8
                3   3   6  12
        ----------------------
        1  −1   3   3  −2  12

U: That’s true in this example, but if you tried to multiply 1 + X!° +
   X* by X1° + X* your way, you would have an awful lot of
   zeros that I don’t have to bother with. In fact, I can do this product
   in my head my way.

I: Good point. Let me summarize what you’ve said about polynomials.
   A polynomial is determined by the sequence a_0, a_1, ... of its
   coefficients. This sequence is an infinite sequence of real numbers
   that happens to consist of all zeros from some point on. To
   compute the coefficients of the sum or the product of two polynomials
   f and g, one needs only to be given the coefficients of f and g.

U: Yes, I’ll agree with that.

I: Well, if a polynomial is determined by its sequence of coefficients
   and if the procedures for adding and multiplying two polynomials
   require only a knowledge of the coefficients of those polynomials,
   then I, as an algebraist, would be tempted to say that a
   polynomial is the sequence of its coefficients. Thus I would define a
   polynomial with real coefficients to be a function f from
   {0, 1, 2, ...} to R such that f(i) = 0 for all sufficiently large values
   of i.

U: I think I see what you’re driving at, but I sure don’t think of
   polynomials that way!

I: Frankly, most of the time neither do I.

Let us formalize what the instructor and the undergraduate have been
saying about polynomials and, at the same time, broaden the concept of a
coefficient to include things other than real numbers. Let R be a ring and
let M be the set of nonnegative integers. A polynomial with coefficients in
R is a function f: M → R such that for some integer n, depending on f,
f(i) = 0 for i > n. For the moment, let us denote the set of all such
polynomials by P. If f and g are in P, we define two new polynomials f + g
and fg by

\[ (f + g)(k) = f(k) + g(k), \]

\[ (fg)(k) = \sum_{i=0}^{k} f(i)\,g(k - i). \]
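
To make the two definitions concrete, here is a small Python sketch in which a polynomial is stored as the list of its coefficients f(0), f(1), ..., with every coefficient past the end of the list understood to be 0. The helper names coeff, poly_add, and poly_mul are ours, and integer coefficients are assumed for the example.

```python
def coeff(f, i):
    """The value f(i): the stored coefficient, or 0 past the end of the list."""
    return f[i] if 0 <= i < len(f) else 0

def poly_add(f, g):
    # (f + g)(k) = f(k) + g(k)
    return [coeff(f, k) + coeff(g, k) for k in range(max(len(f), len(g)))]

def poly_mul(f, g):
    # (fg)(k) = sum of f(i) g(k - i) for i = 0, ..., k
    if not f or not g:
        return []
    return [sum(coeff(f, i) * coeff(g, k - i) for i in range(k + 1))
            for k in range(len(f) + len(g) - 1)]
```

With these definitions, poly_mul([1, 1, 2, 4], [1, -2, 3]) reproduces the product worked out in the dialogue.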
THEOREM 1. The set P with the operations of addition and
multiplication is a ring.

Proof. We will leave most of the details as an exercise. As an
illustration, we will prove that multiplication is associative. Suppose f, g,
and h are all in P. Then

\[ \begin{aligned}
[(fg)h](k) &= \sum_{i=0}^{k} (fg)(i)\,h(k - i) \\
&= \sum_{i=0}^{k} \Big( \sum_{j=0}^{i} f(j)\,g(i - j) \Big) h(k - i) \\
&= \sum_{i=0}^{k} \sum_{j=0}^{i} f(j)\,g(i - j)\,h(k - i).
\end{aligned} \]

Also,

\[ \begin{aligned}
[f(gh)](k) &= \sum_{i=0}^{k} f(i)\,(gh)(k - i) \\
&= \sum_{i=0}^{k} f(i) \Big( \sum_{j=0}^{k-i} g(j)\,h(k - i - j) \Big) \\
&= \sum_{i=0}^{k} \sum_{j=0}^{k-i} f(i)\,g(j)\,h(k - i - j).
\end{aligned} \]


It is now easy to see that both [(fg)h](k) and [f(gh)](k) are equal to the
sum of all possible products f(u)g(v)h(w), where u, v, w are nonnegative
integers and u + v + w = k. Thus (fg)h = f(gh). Note that this proof, like the
proofs of the other ring axioms, does not require that R be commutative.
The additive identity for P is the sequence 0, 0, 0, ... and the
multiplicative identity is 1, 0, 0, .... □


For r in R define f_r to be the element of P such that f_r(0) = r and
f_r(i) = 0, i ≥ 1.


THEOREM 2. The map r ↦ f_r is an injective ring homomorphism of
R into P.


Proof. It follows immediately from the definition of addition and
multiplication in P that f_r + f_s = f_{r+s} and f_r f_s = f_{rs}. Since f_1 is the
multiplicative identity of P, the map r ↦ f_r is a ring homomorphism. Now f_r =
0 if and only if r = 0, so this map is injective. □


If f is a nonzero polynomial in P, then the degree of f is the largest 
integer n such that f(n) ≠ 0. By the definition of a polynomial, f(i) ≠ 0
for only finitely many values of i, and so n exists. We write n = deg(f). If 
f has degree n, then f(n) is called the leading coefficient of f. The degree 
of the zero polynomial is not usually defined, but sometimes it is said to be −∞.

THEOREM 3. Suppose R is an integral domain. If f and g are nonzero
elements of P, then fg ≠ 0. Thus P is an integral domain. Moreover,
deg(fg) = deg(f) + deg(g), and the leading coefficient of fg is the product
of the leading coefficients of f and g.


Proof. Let m = deg(f) and n = deg(g) and let k be in M. Then

\[ (fg)(k) = \sum_{i=0}^{k} f(i)\,g(k - i). \]

If k > m + n and 0 ≤ i ≤ k, then either i > m or k − i > n. Thus (fg)(k) =
0. Similarly, we see that (fg)(m + n) = f(m)g(n). Since R is an integral
domain, f(m)g(n) ≠ 0, and so deg(fg) = m + n and f(m)g(n) is the leading
coefficient of fg. □
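
The hypothesis that R is an integral domain cannot be dropped. As an illustration of our own (it does not appear in the text), the following Python sketch multiplies 1 + 2X by itself over Z_4, where 2·2 = 0, and over Z_5. The function names are ours, polynomials are coefficient lists, and the zero polynomial is given degree None here purely for convenience.

```python
def poly_mul_mod(f, g, n):
    """Product of two coefficient lists with coefficients reduced modulo n."""
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = (h[i + j] + a * b) % n
    return h

def degree(f):
    """Largest i with f[i] != 0, or None for the zero polynomial."""
    for i in range(len(f) - 1, -1, -1):
        if f[i] != 0:
            return i
    return None

square_mod_4 = poly_mul_mod([1, 2], [1, 2], 4)   # (1 + 2X)^2 over Z_4
square_mod_5 = poly_mul_mod([1, 2], [1, 2], 5)   # the same product over Z_5
```

Over Z_4 the product collapses to the constant polynomial 1, so the degree is 0 and not 1 + 1; over the field Z_5 the degree is 2, as Theorem 3 requires.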


We have introduced polynomials as infinite sequences to avoid having
to define what is meant by an “expression” a_0 + a_1X + ... + a_nX^n and
to say when two such expressions represent the same polynomial. Our
definition also allows addition and multiplication in P to be described
easily. It is now time to admit that we will normally use the more familiar
notation a_0 + a_1X + ... + a_nX^n and variants of it, such as a_nX^n + ... +
a_0, to denote polynomials. But what is X? Usually X is referred to as a
variable or indeterminate, with no attempt made to define what these
terms mean. We can give another answer. The symbol X stands for a
particular element of P: the function f: M → R such that f(1) = 1 and f(i) = 0,
i ≠ 1. It is easily shown that in P the nth power X^n of X is the polynomial
g such that g(n) = 1, g(i) = 0, i ≠ n. If we identify r in R with the
corresponding constant polynomial f_r defined previously, then we may consider
R to be a subring of P. When this is done, we may write a polynomial f in
P as f(0) + f(1)X + ... + f(n)X^n, where n is any integer such that f(i) = 0
for i > n. Such an expression is a sum of products of elements in P and, as
such, is unambiguous.

When we denote the polynomial 0, 1, 0, ... by X, we usually write
P as R[X]. If, for some reason, we prefer to use Y to stand for the
polynomial 0, 1, 0, ..., then we refer to P as R[Y]. One situation where this
is done is the case in which the coefficient ring is itself a ring of polynomials.
Having started with R and constructed the ring P of polynomials with
coefficients in R, we might wish to consider the ring Q of polynomials with
coefficients in P. If we have chosen to let X be a particular element of
P, we cannot use X to stand for the polynomial 0, 1, 0, ... in Q. Thus we
write Q = P[Y] = R[X][Y] = R[X,Y], not Q = R[X][X].

Let f be a polynomial in R[X]. According to our formal definition,
the expression f(i) represents the ith coefficient of f. This notation is in
conflict with notation used in calculus. There we are taught that if f =
2 − X + 3X^2, then f(2) means 2 − 2 + 3(2^2) = 12, not the coefficient
3 of X^2. Since we have already agreed to adopt the familiar notation f =
a_0 + a_1X + ... + a_nX^n for polynomials, it should not come as a surprise
that from now on (except in certain exercises at the end of this section)
whenever b is in R we will mean by f(b) the element a_0 + a_1b + ... +
a_nb^n in R.

THEOREM 4. Let R be a commutative ring and let b be an element
of R. The map f ↦ f(b) is a ring homomorphism of R[X] into R.


Proof. Let f and g be in R[X]. The equality (f + g)(b) = f(b) + g(b)
holds even without the assumption that R is commutative. However, to
prove that (fg)(b) = f(b)g(b), we need to know that (cb^i)(db^j) = cdb^{i+j},
and this requires that R be commutative, or at least that b commute with
every element in R. Since the identity 1 + 0X + ... of R[X] is mapped to
1 in R, the map f ↦ f(b) is a ring homomorphism. □


The map f ↦ f(b) in Theorem 4 is called evaluation at b. Here b is fixed
and f ranges over R[X]. We can also assume that f is fixed and obtain a
map b ↦ f(b) of R into R. Such a map is called the polynomial function
on R defined by f. In calculus, it is not necessary to distinguish between
polynomials and polynomial functions, since two polynomials f and g in
R[X] define the same polynomial function of R into R if and only if f =
g. However, when R is a finite ring, different polynomials may define the
same polynomial function. For example, Z_2 has two elements, so there are
exactly 2^2 = 4 functions from Z_2 to Z_2. The ring Z_2[X] is infinite, and each
of the functions from Z_2 to Z_2 is defined by infinitely many different
polynomials. For instance, the identity function on Z_2 is defined by each
of the polynomials X, X^2, X + X^2 + X^3, and X^2 + X^7 + X^11.
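
The distinction between a polynomial and the function it defines is easy to check by machine. The following Python sketch, with a function name of our own choosing, tabulates the polynomial function defined by a coefficient list over Z_n as the tuple of values f(0), ..., f(n − 1).

```python
def poly_function(f, n):
    """Values (f(0), ..., f(n-1)) of the polynomial with coefficient list f over Z_n."""
    return tuple(sum(c * pow(b, i, n) for i, c in enumerate(f)) % n
                 for b in range(n))

# Three different polynomials over Z_2 that all define the identity function.
identity_like = [[0, 1], [0, 0, 1], [0, 1, 1, 1]]   # X, X^2, X + X^2 + X^3
tables = [poly_function(f, 2) for f in identity_like]
```

Each entry of tables is the same tuple, and the four polynomials 0, 1, X, and 1 + X already account for all four functions from Z_2 to Z_2.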

The library CLASSLIB contains procedures for manipulating arrays
of polynomials with coefficients in the rings Z, Z_n, or R as well as in small
finite rings. A polynomial a_0 + a_1X + ... + a_nX^n in R[X] is represented
by the vector of its coefficients. A matrix A with entries in R[X] is
represented by a rank 3 array A in which A[I;J;] lists the coefficients of the
corresponding entry in A. Thus


\[ A = \begin{bmatrix} 0.1 + 2X & 1 - X^2 \\ X^2 & 2.3 + X + X^2 \end{bmatrix} \]

is represented by

      □←A←2 2 3⍴0.1 2 0 1 0 ¯1 0 0 1 2.3 1 1
0.1  2  0
1    0 ¯1

0    0  1
2.3  1  1




The procedure DARV may be used to display A in a more convenient manner. 


      1 DARV A
 .1  2.0   .0    1.0   .0  ¯1.0
 .0   .0  1.0    2.3  1.0   1.0


The procedures RXSUM, RXDIFF, and RXPROD compute entry-by- 
entry sums, differences, and products of arrays of real polynomials. For 
example, 


2 DARV B+2 2 3p 11004 104100 .73 0 
“41.00 1.00 .00 .00 41.00 1.00 
.00 1.00 .00 . 00 .73 .00 


pa [it oe 
X 0.73X | 


We can compute A +B, A —B,andA X B as follows. 


represents the matrix 


2 DARV A RXSUM B 


~.90 3.00 .00 1.00 1.00 2.00 
.00 1.00 1.00 2.30 1.73 1.00 
2 DARV A RXDIFF B 
1.10 1.00 .00 1.00 1.00 .00 
.00 1.00 1.00 2.30 .27 1.00 
2 DARV A RXPROD B 
~,10 1.90 2.00 .00 .00 00 1.00 1.00 1.00 1.00 
.00 .00 .00 1.00 .00 .00 1.68 73 .73 .00 


The procedure RXDEGREE computes the degrees of the polynomials 
represented by its argument. 


      RXDEGREE A                     RXDEGREE 0 0 0
1 2                            ¯1
2 2


The value returned by RXDEGREE for the zero polynomial is ¯1. This is
a nonstandard definition. The procedure RXLEAD calculates the leading
coefficients of an array of real polynomials.


      □←C←RXLEAD A                   RXLEAD 0 0 0
2 ¯1                           1
1  1


Here C[I;J] is the leading coefficient of A[I;J;]. Note that RXLEAD
considers the leading coefficient of the zero polynomial to be 1, another
nonstandard definition. Recall the statement in the previous section that
entries in real arrays that are smaller in absolute value than EPSILON 
times the largest absolute value of an entry in the array are considered to be 
0. This applies to RXSUM, RXDIFF, RXPROD, RXDEGREE, and RXLEAD. 

Arrays of polynomials in Z[X] are best manipulated using the pro- 
cedures with the prefix ZX but procedures such as RXSUM and RXPROD 
may be used as long as all entries remain less than +EPSILON. 

Arrays of polynomials in Z_n[X] are described by arrays of integers
that represent the coefficients. The modulus n is, as always, the global
variable N. To multiply 2 + 5X + 3X^2 + 9X^3 and 7 + 10X + 2X^3 + 7X^4
in Z_11[X], we enter


      N←11
      2 5 3 9 ZNXPROD 7 10 0 2 7
3 0 5 9 4 8 6 8


The result is 3 + 5X^2 + 9X^3 + 4X^4 + 8X^5 + 6X^6 + 8X^7. The procedures
ZNXSUM, ZNXDIFF, ZNXDEGREE, and ZNXLEAD are used in a similar
way.
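
The product above can be checked independently. The Python sketch below is not the ZNXPROD procedure; it is a direct implementation of the coefficient formula for a product, with the modulus passed as an argument, and the function name is ours.

```python
def znx_prod(f, g, n):
    """Coefficient list of the product of f and g in Z_n[X]."""
    if not f or not g:
        return []
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            # The term a X^i times b X^j contributes to the coefficient of X^(i+j).
            h[i + j] = (h[i + j] + a * b) % n
    return h

result = znx_prod([2, 5, 3, 9], [7, 10, 0, 2, 7], 11)
```

Here result lists the coefficients of the product in order of increasing degree, so it can be compared entry by entry with the APL output.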

If R is the current finite ring described by the four arrays FRPLUS,
FRTIMES, FRNEG, and FRINV, then computations in R[X] can be
performed using the procedures prefixed by FRX. Here are some examples in
which R is the ring with 8 elements described in Section 1.


      PLUS FRINIT TIMES
      DAZV C←2 2 2⍴1 1 5 2 3 1 7 4
1 1    5 2
3 1    7 4
      DAZV D←2 2 2⍴7 0 4 1 6 2 5 5
7 0    4 1
6 2    5 5
      DAZV C FRXSUM D
6 1    1 3
5 3    2 1
      DAZV C FRXPROD D
7 7 0    0 5 2
6 4 2    7 7 0


We can evaluate polynomials using the procedures RXEVAL, ZNXEVAL,
and FRXEVAL. For example, to evaluate all of the polynomials in the
preceding matrix A at 3.7, we calculate


      1 DARV A
 .1  2.0   .0    1.0   .0  ¯1.0
 .0   .0  1.0    2.3  1.0   1.0
      A RXEVAL 3.7
 7.5   ¯12.69
13.69   19.69


To evaluate each entry of A at a different point, we make the second argu- 
ment of RXEVAL a matrix. 


      □←X←2 2⍴3.7 1.2 1 0.6
3.7 1.2
1   0.6
      A RXEVAL X
7.5 ¯0.44
1    3.26


In general, RXEVAL evaluates each polynomial in its first argument at the 
corresponding entry of its second argument. Arrays with one entry are 
expanded in the usual manner. The procedures ZNXEVAL and FRXEVAL
evaluate polynomials over Z_n and the current finite ring, respectively.
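
Evaluation of a polynomial given by its coefficient vector is usually done by Horner's rule, which needs one multiplication and one addition per coefficient. The text does not say how RXEVAL and ZNXEVAL are implemented, so the following Python sketch should be read only as one reasonable way to compute the same values; the function names are ours.

```python
def poly_eval(f, b):
    """Value a_0 + a_1 b + ... + a_n b^n for the coefficient list f = [a_0, ..., a_n]."""
    result = 0
    for c in reversed(f):
        # Horner's rule: work from the leading coefficient downward.
        result = result * b + c
    return result

def poly_eval_mod(f, b, n):
    """The same evaluation carried out in Z_n."""
    result = 0
    for c in reversed(f):
        result = (result * b + c) % n
    return result
```

For example, poly_eval([0.1, 2, 0], 3.7) evaluates 0.1 + 2X at 3.7, and poly_eval_mod([2, 5, 3, 9], 2, 11) evaluates 2 + 5X + 3X^2 + 9X^3 at 2 in Z_11.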


EXERCISES 


1 Complete the proof of Theorem 1. 
2 Let A and B be vectors listing the coefficients of two polynomials 
f and g in R[X]. Write an APL expression for a vector listing the 


coefficients of f + g. Allow for the possibility that A and B have 
different lengths. 


3 The vectors A←1 ¯2 3 and B←1 1 2 4 list the coefficients
of two polynomials f and g in R[X]. The coefficients of fg are
the column sums of the matrix

      D
1  1  2  4  0  0
0 ¯2 ¯2 ¯4 ¯8  0
0  0  3  3  6 12

The matrix

      □←E←A∘.×B

is closely related to D. Write an APL expression constructing D
from E. Use this to give an APL definition for the product of
two polynomials.


4 We have defined the polynomial ring in two variables R[X,Y] to
be R[X][Y]. Let M = {0, 1, ...}. Show that we can construct
R[X,Y] as the set Q of all functions f: M × M → R such that
f(i, j) = 0 whenever i or j is sufficiently large. Define carefully
the operations of addition and multiplication in Q.


*5 A polynomial in R[X,Y] can be represented by a matrix whose
I,Jth entry is the coefficient of X^I Y^J. If A and B represent
polynomials f and g in R[X,Y], write expressions for matrices
representing f + g and fg.


6 Experiment with the procedures for polynomial manipulation 
in CLASSLIB. 


7 Let I be an ideal of the ring R and let J be the ideal of R[X]
generated by I. Describe J and show that (R/I)[X] is isomorphic
to R[X]/J.

8 Find a quadratic polynomial f in Z,[X] such that f(a) = 0 for 
alla in Z,. 

9 Find a function from Z, to Z, that is not defined by any poly- 
nomial in Z, [LX]. 

10 Let f: R → S be a ring homomorphism. Show that the map taking
a_0 + a_1X + ... + a_nX^n in R[X] to f(a_0) + f(a_1)X + ... + f(a_n)X^n
in S[X] is a homomorphism.


5. MATRIX RINGS 


This book has been written with the assumption that most readers are 
already familiar with matrix multiplications for matrices with entries in 
R. In this section we will review the definition of matrix multiplication and 
extend it to matrices with entries in an arbitrary ring. 

Let R be a ring and let A and B be matrices over R, that is, matrices
with entries in R. Assume that A is an ℓ-by-m matrix and B is an m-by-n
matrix. The matrix product of A and B is the ℓ-by-n matrix C such that

\[ C_{ij} = \sum_{k=1}^{m} A_{ik}B_{kj}. \]


The matrix product is only defined when the number of columns of the
first factor is equal to the number of rows of the second. If A and B are
real matrices, then in APL notation the matrix product C of A and B is
defined by the condition C[I;J]=+/A[I;]×B[;J] so that C is the inner
product A+.×B. For example,


      □←A←2 3⍴2 1 ¯4 0 5 1
2 1 ¯4
0 5  1
      □←B←3 4⍴1 ¯8 3 ¯4 2 6 1 3 ¯2 7 1 5
 1 ¯8 3 ¯4
 2  6 1  3
¯2  7 1  5
      A+.×B
12 ¯38 3 ¯25
 8  37 6  20
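
Since the definition of the matrix product uses only the addition and multiplication of the coefficient ring, it can be written once and reused for any ring. The Python sketch below is our own illustration of that point: the ring operations are passed in as functions (defaulting to ordinary integer arithmetic), and matrices are lists of rows.

```python
from functools import reduce
import operator

def mat_prod(a, b, add=operator.add, mul=operator.mul, zero=0):
    """Matrix product of a and b over a ring given by add, mul, and zero."""
    m = len(b)
    if any(len(row) != m for row in a):
        raise ValueError("columns of the first factor must equal rows of the second")
    n = len(b[0])
    # Entry (i, j) is the sum over k of a[i][k] * b[k][j], in that order,
    # so the code remains correct when the ring is not commutative.
    return [[reduce(add, (mul(a[i][k], b[k][j]) for k in range(m)), zero)
             for j in range(n)]
            for i in range(len(a))]

def add6(x, y):
    return (x + y) % 6

def mul6(x, y):
    return (x * y) % 6
```

Calling mat_prod(a, b) multiplies integer matrices, while mat_prod(a, b, add6, mul6) multiplies the same arrays regarded as matrices over Z_6.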


The matrix product of A and B is usually denoted simply as AB. If
it is necessary to use an explicit symbol for matrix multiplication, we will
borrow the symbol +.× from APL. As noted earlier, the expression A × B
will always denote the entry-by-entry product of two arrays with the same
shape.

Matrix multiplication satisfies conditions that may be viewed as gen- 
eralizations of the associative law for multiplication and the distributive 
law in rings. 


THEOREM 1. Let A, B, and C be ℓ-by-m, m-by-n, and n-by-r matrices,
respectively, over the ring R. Then the products (AB)C and A(BC) are
defined and are equal.


Proof. The matrices AB and BC are ℓ-by-n and m-by-r matrices,
respectively. Thus the products (AB)C and A(BC) are defined. Moreover,

\[ \begin{aligned}
[A(BC)]_{ij} &= \sum_{s=1}^{m} A_{is}(BC)_{sj} \\
&= \sum_{s=1}^{m} A_{is} \sum_{t=1}^{n} B_{st}C_{tj} \\
&= \sum_{t=1}^{n} \Big( \sum_{s=1}^{m} A_{is}B_{st} \Big) C_{tj} \\
&= \sum_{t=1}^{n} (AB)_{it}C_{tj} = [(AB)C]_{ij}. \qquad \square
\end{aligned} \]

THEOREM 2. Let B and C be m-by-n matrices and let A and D be
ℓ-by-m and n-by-r matrices, respectively, all over the ring R. Then


A(B + C) = AB + AC,
(B + C)D = BD + CD.


Proof. All of the indicated products are defined. The proofs of these
equalities are similar to the proof of Theorem 1 and are left as exercises. □


It should be noted that the proofs of Theorems 1 and 2 do not require
R to be commutative.

If A and B are both n-by-n matrices over R, then the sum A + B and
the product AB are also n-by-n matrices. It follows immediately from
Theorems 1 and 2 that the set M_n(R) of all n-by-n matrices over R is a ring
under the operations + and +.×. The zero element is the n-by-n matrix
of zeros and the multiplicative identity is the matrix I such that I_ij = 0,
i ≠ j, and I_ii = 1, 1 ≤ i ≤ n. Here is an example in M_2(Z).


      □←I←(⍳2)∘.=⍳2                  I+.×A
1 0                            4 7
0 1                            1 3
      □←A←2 2⍴4 7 1 3                A+.×I
4 7                            4 7
1 3                            1 3


The group of units of M_n(R) is denoted GL_n(R) or GL(n, R) and is
called the n-by-n general linear group over R. In later sections we will
devote considerable attention to the problem of deciding which elements
of M_n(R) are in GL_n(R), particularly when R is commutative.

It is convenient to extend the operation +.× to pairs of arrays other
than matrices. This will be done in exactly the same way that +.× is defined
for real arrays in APL. Commonly, one factor will be a matrix and the other
will be a vector. If A is an m-by-n matrix and V and W are vectors of length
m and n, respectively, then VA and AW are the vectors with components

\[ (VA)_j = \sum_{i=1}^{m} V_iA_{ij}, \]

\[ (AW)_i = \sum_{j=1}^{n} A_{ij}W_j. \]




The transpose B of a matrix A is the matrix obtained from A by
interchanging rows and columns. Thus B_ij = A_ji. We usually write B = A^t. In
APL notation this would be B←⍉A.


THEOREM 3. Let A and B be ℓ-by-m and m-by-n matrices,
respectively, over the commutative ring R. Then (AB)^t = B^tA^t.


Proof. We have

\[ [(AB)^t]_{ij} = (AB)_{ji} = \sum_{k=1}^{m} A_{jk}B_{ki}. \]

Also,

\[ [B^tA^t]_{ij} = \sum_{k=1}^{m} (B^t)_{ik}(A^t)_{kj} = \sum_{k=1}^{m} B_{ki}A_{jk}. \]

Since R is commutative, the ijth entries of (AB)^t and B^tA^t are the same. □


The primitive APL operations +.× and ⍉ can be used to compute
products and transposes of matrices with entries in Z or R. In CLASSLIB
the procedures with the suffix MATPROD perform matrix multiplication
with matrices over Z_n, Q, Z_n[X], Z[X], and R[X], as well as finite rings.
For example, to compute the product


\[ \begin{bmatrix} 2/5 & -7/3 \\ 4/9 & 1 \end{bmatrix} \begin{bmatrix} 3 & 5/7 \\ 2/3 & -1/8 \end{bmatrix} \]

in M_2(Q) we can enter


      DAQ A←2 2 2⍴2 5 ¯7 3 4 9 1 1
2/5  ¯7/3
4/9   1/1
      DAQ B←2 2 2⍴3 1 5 7 2 3 ¯1 8
3/1   5/7
2/3  ¯1/8
      DAQ A QMATPROD B
¯16/45  97/168
   2/1  97/504
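
This product can be confirmed with exact rational arithmetic in Python. The sketch below uses the standard library class fractions.Fraction in place of the numerator and denominator pairs of the text; the helper q_mat_prod is our own name and is not the QMATPROD procedure.

```python
from fractions import Fraction as Fr

def q_mat_prod(a, b):
    """Matrix product of two matrices whose entries are Fractions."""
    return [[sum((a[i][k] * b[k][j] for k in range(len(b))), Fr(0))
             for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[Fr(2, 5), Fr(-7, 3)],
     [Fr(4, 9), Fr(1)]]
B = [[Fr(3), Fr(5, 7)],
     [Fr(2, 3), Fr(-1, 8)]]

product = q_mat_prod(A, B)
```

Fraction keeps every entry in lowest terms automatically, which corresponds to the reduction performed by the procedures with the prefix Q.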


In M_2(Z_5[X]) we can compute the product

\[ \begin{bmatrix} 1 + X + 3X^2 & 1 + 2X \\ 3 + 4X & X \end{bmatrix} \begin{bmatrix} 2 + 3X & 3 \\ 2X + X^2 & 4 + X \end{bmatrix} \]

as follows.


      DAZV C←2 2 3⍴1 1 3 1 2 0 3 4 0 0 1 0
1 1 3    1 2 0
3 4 0    0 1 0
      DAZV D←2 2 3⍴2 3 0 3 0 0 0 2 1 4 1 0
2 3 0    3 0 0
0 2 1    4 1 0
      DAZV C ZNXMATPROD D
2 2 4 1    2 2 1 0
1 2 4 1    4 1 1 0


Thus the answer is

\[ \begin{bmatrix} 2 + 2X + 4X^2 + X^3 & 2 + 2X + X^2 \\ 1 + 2X + 4X^2 + X^3 & 4 + X + X^2 \end{bmatrix}. \]
For ZNMATPROD and ZNXMATPROD the modulus N must not exceed 10^7.
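
A product of matrices with polynomial entries combines the two constructions of this chapter: each entry of the result is a sum of products of polynomials. The Python sketch below is our own illustration and not the ZNXMATPROD procedure. It takes the coefficient arrays C and D from the example, assumes the modulus 5, and stores each matrix as a nested list in which the innermost list holds the coefficients of one entry.

```python
def poly_add_mod(f, g, n):
    size = max(len(f), len(g))
    f = f + [0] * (size - len(f))      # pad the shorter list with zeros
    g = g + [0] * (size - len(g))
    return [(a + b) % n for a, b in zip(f, g)]

def poly_mul_mod(f, g, n):
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = (h[i + j] + a * b) % n
    return h

def poly_mat_prod_mod(a, b, n):
    """Matrix product over Z_n[X]; entries are coefficient lists."""
    out = []
    for i in range(len(a)):
        row = []
        for j in range(len(b[0])):
            acc = [0]
            for k in range(len(b)):
                acc = poly_add_mod(acc, poly_mul_mod(a[i][k], b[k][j], n), n)
            row.append(acc)
        out.append(row)
    return out

C = [[[1, 1, 3], [1, 2, 0]], [[3, 4, 0], [0, 1, 0]]]
D = [[[2, 3, 0], [3, 0, 0]], [[0, 2, 1], [4, 1, 0]]]
E = poly_mat_prod_mod(C, D, 5)
```

Each entry of E is a list of five coefficients, since no trailing zeros are removed; apart from the extra trailing zero, the entries can be compared with the rows displayed by DAZV.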

The procedure TRAV (for “transpose array of vectors”) can be used
to compute transposes of arrays with entries in Q, Z_n[X], Z[X], or R[X].
For example,


      DAQ A                          DAQ TRAV A
2/5  ¯7/3                       2/5  4/9
4/9   1/1                      ¯7/3  1/1


If R is the current finite ring described by the arrays FRPLUS, FRTIMES,
FRNEG, and FRINV, then products of matrices over R can be calculated
using the procedure FRMATPROD.

Suppose now that R is any ring. We can form the two rings S =
M_n(R[X]) and T = M_n(R)[X]. Here elements of S are matrices with
polynomial entries and elements of T are polynomials with matrix coefficients.
What relationship, if any, holds between S and T? It is possible to define
a map from S to T as follows. Let F be in S. The ijth entry F_ij of F is a
polynomial in R[X]. Let F_ijk denote the coefficient of X^k in F_ij. For a
fixed k, let A_k be the element of M_n(R) whose ijth entry is F_ijk. The
polynomial

\[ f = A_0 + A_1X + A_2X^2 + \cdots \]

is an element of T. As an example, we may take R = Z, n = 2, and
3X +X ee 


~1+3X+4X? 4 — Xx? 



Then the corresponding element of M,(Z) [X] is 


3 | 2 7 , + 
f= + X + X?, 
1 4 3 0 4 | 


THEOREM 4. For any ring R and any positive integer n, the rings
M_n(R[X]) and M_n(R)[X] are isomorphic.


Proof. We leave it as an exercise to verify that the map F ↦ f just
defined is a ring isomorphism. □
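
In terms of the rank 3 array representation, the map F ↦ f amounts to reordering indices: the coefficient F_ijk, stored at position (i, j, k), becomes the ijth entry of the kth matrix coefficient. The Python sketch below illustrates this with nested lists. The function names and the sample matrix are ours; the example is hypothetical and is not the one from the text.

```python
def to_matrix_polynomial(F):
    """From a matrix of coefficient lists F[i][j] to the list [A_0, A_1, ...]."""
    n = len(F)
    length = max(len(F[i][j]) for i in range(n) for j in range(n))

    def coeff(p, k):
        return p[k] if k < len(p) else 0

    return [[[coeff(F[i][j], k) for j in range(n)] for i in range(n)]
            for k in range(length)]

def to_polynomial_matrix(A):
    """Inverse map: from [A_0, A_1, ...] back to a matrix of coefficient lists."""
    n = len(A[0])
    return [[[A_k[i][j] for A_k in A] for j in range(n)] for i in range(n)]

# Hypothetical example: F has entries 1 + X, 2, X^2, and 3X.
F = [[[1, 1], [2]],
     [[0, 0, 1], [0, 3]]]
A = to_matrix_polynomial(F)
```

Here A[k] is the matrix A_k, and applying to_polynomial_matrix to A returns F with every coefficient list padded by zeros to the same length.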


EXERCISES 

1 Prove Theorem 2. 

2 Verify that the ring axioms hold for M,(R). 

3 Compute the following products in M,(Z,). Use ZNMATPROD 
to check your answers. 

(a) |6 5 2 5S (b) |3 4 1 $5 
4 1;/6 1] 1 2 3 5 

4 Compute the following product in M,(Z[LX])and use ZXMATPROD 
to check your answer: 

3—X? 1+X —2 xX? 
1X X+3X? 1-X 2+3X 

5 Prove Theorem 4. 

6 Suppose R is a commutative ring and A and B are matrices over 
R such that AB is defined. Show that for ally in R we have (rA)B = 
A(vB) = r(AB), where rA is obtained from A by multiplying all 
entries by /. 

7 Let f:R—S be a ring homomorphism and let g:M,(R)—M, (S) 
be defined so that the image of a matrix A in M,(R) is obtained by 
applying f to the entries in A. Show that g is a homomorphism. 

8 State Theorem 3 using APL notation. 

9 The ring R = M_2(Z_2) has 16 elements. Construct addition and
multiplication tables for R as follows. Let


      A←16 2 2⍴⍉(4⍴2)⊤⍳16


Then A has shape 16 2 2 and the 16 matrices A[I;;] represent
the 16 elements of R. Construct 16-by-16 matrices P and
T (for “plus” and “times”) such that 2|A[I;;]+A[J;;]
is A[P[I;J];;] and 2|A[I;;]+.×A[J;;] is A[T[I;J];;].




10 Let A and B be matrices with entries in a ring such that the product 
AB is defined. Show that the ith row of AB is the product of 
the ith row of A with B and the jth column of AB is the product 
of A with the jth column of B. 


11 Suppose A, B, C, and D are matrices over a ring R with shape
m-by-r, m-by-s, n-by-r, and n-by-s, respectively. By the block
matrix

\[ M = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \]

we mean the (m + n)-by-(r + s) matrix obtained by arranging the
entries of A, B, C, and D in the indicated pattern.


(a) Give a formula for M[i;j].
(b) Give an APL expression describing M.


12 Show that block matrices may be multiplied as if they were matrices
with matrix entries. That is, suppose

$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \quad\text{and}\quad N = \begin{bmatrix} S & T \\ U & V \end{bmatrix}$$

are block matrices over a ring. Prove that

$$MN = \begin{bmatrix} AS + BU & AT + BV \\ CS + DU & CT + DV \end{bmatrix},$$

provided the indicated products are defined.

13 Let R be a ring and let m and n be positive integers. Prove that
$M_m(M_n(R))$ is isomorphic to $M_{mn}(R)$.

6. DETERMINANTS 


Let A be a square matrix with entries in the commutative ring R. The
determinant of A is a certain element of R that we denote det A. For example, if A is the 2-by-2 matrix

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix},$$

then det A = ad − bc. Sometimes we write

$$\det A = \begin{vmatrix} a & b \\ c & d \end{vmatrix},$$

replacing the brackets used in displaying matrices by vertical straight lines.
For example,

$$\begin{vmatrix} 7 & -1 \\ 4 & 2 \end{vmatrix} = (7)(2) - (-1)(4) = 18.$$

In this section we will define the determinant of any square matrix with
entries in R and establish some basic properties of the determinant function. Additional facts about determinants will be proved in the next section.

Determinants are usually encountered first in the solution of simultaneous linear equations. Suppose a, b, c, d, e, f are real numbers and we
wish to solve the equations

$$\begin{aligned} ax + by &= e, \\ cx + dy &= f. \end{aligned}$$


Multiplying the first equation by d and the second by b and subtracting, 
we obtain 


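$$(ad - bc)x = ed - bf.$$

If $ad - bc \neq 0$, then

$$x = \frac{ed - bf}{ad - bc} = \frac{\begin{vmatrix} e & b \\ f & d \end{vmatrix}}{\begin{vmatrix} a & b \\ c & d \end{vmatrix}}.$$

Similarly, we find that

$$y = \frac{\begin{vmatrix} a & e \\ c & f \end{vmatrix}}{\begin{vmatrix} a & b \\ c & d \end{vmatrix}},$$

again assuming the denominator is not zero.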
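
The two quotient formulas can be tried on a small system. The following is a minimal Python sketch, not the book's APL; the function names `det2` and `cramer_2x2` and the right-hand sides e = 5, f = 8 are illustrative assumptions. It uses the coefficient matrix with rows 7, −1 and 4, 2, whose determinant is 18.

```python
from fractions import Fraction

def det2(a, b, c, d):
    """Determinant ad - bc of the 2-by-2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c

def cramer_2x2(a, b, c, d, e, f):
    """Solve ax + by = e, cx + dy = f by the two determinant quotients."""
    denom = det2(a, b, c, d)
    if denom == 0:
        raise ValueError("ad - bc is zero, so the formulas do not apply")
    x = Fraction(det2(e, b, f, d), denom)  # numerator is ed - bf
    y = Fraction(det2(a, e, c, f), denom)  # numerator is af - ec
    return x, y

# Hypothetical system: 7x - y = 5, 4x + 2y = 8.
x, y = cramer_2x2(7, -1, 4, 2, 5, 8)
print(x, y)
```

Exact fractions are used so that the check 7x − y = 5 and 4x + 2y = 8 holds without rounding.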
The determinant of the 3-by-3 matrix

$$\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$$

is defined to be

$$aei + bfg + cdh - ceg - afh - bdi.$$


This formula can be remembered using the following diagram. 


$$\begin{array}{ccccc} a & b & c & a & b \\ d & e & f & d & e \\ g & h & i & g & h \end{array}$$


The terms with the plus signs are the products along the three diagonals 
sloping down and to the right. The terms with the minus signs are the 
products along the diagonals sloping up and to the right. Using the formula, we see that

$$\begin{vmatrix} 3 & 1 & -2 \\ 2 & -4 & 0 \\ -1 & 5 & -3 \end{vmatrix} = (3)(-4)(-3) + (1)(0)(-1) + (-2)(2)(5) - (-2)(-4)(-1) - (3)(0)(5) - (1)(2)(-3) = 30.$$


Warning

The "obvious" generalization of this scheme does not work for matrices
larger than 3-by-3.


The determinant of an n-by-n matrix is a sum of n! terms, half of which
have a plus sign and half of which have a minus sign. To describe these
terms, we use elements of the symmetric group $\Sigma_n$ and the homomorphism
sgn of $\Sigma_n$ into {1, −1}, defined in Section 3.8. Permutations in $\Sigma_n$ will
be denoted by lower-case Greek letters. It will be convenient to use APL
notation for indexing matrices so that $A_{ij}$ will be written A[i;j]. The index
origin will be assumed to be 1 throughout this section.

Let A be an n-by-n matrix over the commutative ring R. The determinant of A is the element of R given by the expression

$$\det A = \sum_{\sigma} (\operatorname{sgn}\sigma) \prod_{i=1}^{n} A[i;i\sigma],$$

where the sum is over all elements σ of $\Sigma_n$. The product

$$\prod_{i=1}^{n} A[i;i\sigma]$$

has exactly one factor from each row of A and, because σ is a permutation, exactly one factor from each column of A.

If n = 2, then $\Sigma_2$ has two elements. If σ = (1)(2), then the corresponding term in det A is (+1)A[1;1]A[2;2]. With σ = (1, 2), we get
(−1)A[1;2]A[2;1]. Thus

$$\det A = A[1;1]A[2;2] - A[1;2]A[2;1],$$

which agrees with the formula given for the determinant of a 2-by-2 matrix.
Let us consider a somewhat larger example. We will compute the de- 
terminant of the integer matrix 


O<«A+4 4p 314 34 34 10°1413124342 1 
3 


4 
1 
2 

directly from the definition. First, set 


      ⎕IO←1
      S←GPSGN P←GPSYMG 4

The rows of P list the 24 elements of $\Sigma_4$, and S[I] is the sign of P[I;].
If σ is the Ith row of P, then the Ith term in det A is the product of S[I]
and

      A[1;1σ]×A[2;2σ]×... = A[1;P[I;1]]×A[2;P[I;2]]×...

Let

      B←2 1 2⍉A[;P]

Then B[I;J] is A[J;P[I;J]] and the Ith term in the determinant of
A is S[I]××/B[I;]. Thus the 24 terms of det A are given by the components of the vector S××/B. Therefore det A is

      +/S××/B
¯181

We can also write det A as S+.××/B.

Although our APL formulation of the definition of the determinant 
is valid for any square real matrix, typical workspace limits do not permit 
its use for matrices larger than 4-by-4. 
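
For readers without an APL system, the same sum over permutations can be sketched in Python. This is an illustration of the definition only, not one of the CLASSLIB procedures; the names `sign` and `det_by_definition` are assumptions, indices are 0-based, and the sign is computed by counting inversions rather than by GPSGN.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of images of 0, ..., n-1."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_by_definition(a):
    """Sum of (sgn sigma) * prod_i a[i][sigma(i)] over all permutations sigma."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= a[i][perm[i]]  # one factor from each row and each column
        total += term
    return total

print(det_by_definition([[7, -1], [4, 2]]))
print(det_by_definition([[3, 1, -2], [2, -4, 0], [-1, 5, -3]]))
```

The loop visits n! permutations, so, like the APL formulation, it is practical only for small matrices.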

As defined, the determinant of an n-by-n matrix has n! terms, each
with n factors. This would seem to make the determinant of a large matrix,
say a 10-by-10 matrix, very difficult to compute. Actually, the determinant
of an n-by-n matrix with entries in $\mathbb{R}$ or $\mathbb{Z}_p$, p a prime, can be computed
in a time proportional to $n^3$, which grows much more slowly than n!. In
CLASSLIB there are procedures for calculating determinants of matrices
with entries in various commutative rings. The procedures ZDET, ZNDET,
RDET, RXDET, ZNXDET, and FRDET compute determinants of matrices
over $\mathbb{Z}$, $\mathbb{Z}_n$, $\mathbb{R}$, $\mathbb{R}[X]$, $\mathbb{Z}_n[X]$, and the current finite ring, respectively. The
algorithms used by most of these procedures are discussed in Chapter 6.
However, FRDET uses the method described in Exercise 7.19. When ZNDET
is used, the modulus N may not exceed $10^7$; when ZNXDET is used, N must,
in addition, be prime. It should also be noted that FRDET does not check
whether the current finite ring is commutative. The procedure MPZDET
computes the determinant of an integer matrix as a multiple-precision
integer. It can be used when the value of the determinant or the result
of some intermediate calculation exceeds the precision of an APL scalar.
See Exercise 15.

By way of an example, let us use ZDET to compute the determinants 
of the preceding matrix A and of its transpose. 


      ZDET A                ZDET ⍉A
¯181                  ¯181


As the next theorem proves, the fact that A and ⍉A have the same determinant is no accident.


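THEOREM 1. If A is any square matrix over R, then $\det A^t = \det A$.

Proof. Let $B = A^t$. From the definition of the function det, we have

$$\det A = \sum_{\sigma} (\operatorname{sgn}\sigma) \prod_{i=1}^{n} A[i;i\sigma],$$

$$\det B = \sum_{\tau} (\operatorname{sgn}\tau) \prod_{j=1}^{n} B[j;j\tau] = \sum_{\tau} (\operatorname{sgn}\tau) \prod_{j=1}^{n} A[j\tau;j],$$

where σ and τ range over $\Sigma_n$. Fix i and σ for a moment and define j to be
iσ. Then $i = j\sigma^{-1}$ and $A[i;i\sigma] = A[j\sigma^{-1};j]$. If we keep σ fixed and let i vary
from 1 to n, then j ranges over the set {1,...,n}. Thus

$$\prod_{i=1}^{n} A[i;i\sigma] = \prod_{j=1}^{n} A[j\sigma^{-1};j].$$

Since $\operatorname{sgn}\sigma^{-1} = \operatorname{sgn}\sigma$ (see Exercise 3.8.9), we have

$$\det A = \sum_{\sigma} (\operatorname{sgn}\sigma) \prod_{j=1}^{n} A[j\sigma^{-1};j].$$

If we write τ for $\sigma^{-1}$ and note that τ runs over $\Sigma_n$ as σ does, we obtain

$$\det A = \sum_{\tau} (\operatorname{sgn}\tau) \prod_{j=1}^{n} A[j\tau;j] = \det B. \qquad \square$$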


A square matrix A is said to be upper triangular if A[i;j] = 0 for
i > j. This is equivalent to saying that all entries below the main diagonal
are 0. The matrix

      ⎕←U←3 3⍴2 1 0 0 8 2 0 0 3
2 1 0
0 8 2
0 0 3

is upper triangular.


THEOREM 2. If A is an upper triangular matrix, then det A is the
product of the entries of A on the main diagonal.


Proof. Let σ be in $\Sigma_n$. If σ is not the identity permutation, then for
some i in {1,...,n} we have iσ < i. To see this, suppose iσ ≥ i for all
i. Let $(i_1,\dots,i_r)$ be a nontrivial cycle of σ. Then $i_1 \le i_2 \le \dots \le i_r \le i_1$. This
means $i_1 = i_2$, which cannot be the case. Thus if σ is not the identity, the
product

$$\prod_{i=1}^{n} A[i;i\sigma]$$

contains a factor below the main diagonal of A and hence the product is
0. Therefore

$$\det A = \prod_{i=1}^{n} A[i;i]. \qquad \square$$

The computation

      ZDET U                ×/1 1⍉U
48                    48

verifies Theorem 2 in a particular case or provides a check on the correctness of the coding of ZDET, depending on one's point of view. As a special
case of Theorem 2, we see that the determinant of the n-by-n identity
matrix is 1. Lower triangular matrices are defined in the obvious way, and
Theorem 2 holds with "upper" replaced by "lower".


The next result is one of the most important properties of the de- 
terminant. 


THEOREM 3. Let A and B be square matrices of the same size with
entries in R. Then det(A+.×B) = (det A)(det B).

Proof. This theorem states that det is a homomorphism from the
monoid ($M_n(R)$, +.×) to the monoid (R, ×). There are many different
approaches that can be taken in proving this fact. The one chosen here
is the most straightforward, although perhaps not the most elegant. Using
the definition of the determinant to evaluate each side of the equality
to be verified, we obtain two somewhat complicated expressions that must
be proved equal. This is done by showing that a great deal of cancellation
occurs in one of the expressions.

Let C = A+.×B. Then

$$\det C = \sum_{\sigma} (\operatorname{sgn}\sigma) \prod_{i=1}^{n} C[i;i\sigma]$$

and

$$C[i;i\sigma] = \sum_{j=1}^{n} A[i;j]B[j;i\sigma].$$

If in Theorem 1.4 we let $a_i = C[i;i\sigma]$ and $b_{ij} = A[i;j]B[j;i\sigma]$, then we get

$$\prod_{i=1}^{n} C[i;i\sigma] = \sum_{\rho} \prod_{i=1}^{n} A[i;i\rho]B[i\rho;i\sigma],$$

where ρ ranges over the maps of {1,...,n} into itself. For a fixed σ and
ρ, let

$$d(\sigma,\rho) = \prod_{i=1}^{n} A[i;i\rho]B[i\rho;i\sigma].$$

Then

$$\det C = \sum_{\sigma} (\operatorname{sgn}\sigma) \sum_{\rho} d(\sigma,\rho). \tag{*}$$

Now let us expand (det A)(det B). We have

$$\det A = \sum_{\sigma} (\operatorname{sgn}\sigma) \prod_{i=1}^{n} A[i;i\sigma],$$

$$\det B = \sum_{\tau} (\operatorname{sgn}\tau) \prod_{j=1}^{n} B[j;j\tau],$$

where σ and τ run over $\Sigma_n$. By the distributive laws (see Exercise 1.17),
we have

$$(\det A)(\det B) = \sum_{\sigma,\tau} (\operatorname{sgn}\sigma)(\operatorname{sgn}\tau) \Bigl(\prod_{i=1}^{n} A[i;i\sigma]\Bigr) \Bigl(\prod_{j=1}^{n} B[j;j\tau]\Bigr).$$

Now (sgn σ)(sgn τ) = sgn στ. Also, if we write j = iσ, then j runs over {1,...,n} as i does and

$$\prod_{j=1}^{n} B[j;j\tau] = \prod_{i=1}^{n} B[i\sigma;i\sigma\tau].$$

Thus

$$(\det A)(\det B) = \sum_{\sigma,\tau} (\operatorname{sgn}\sigma\tau) \prod_{i=1}^{n} A[i;i\sigma]B[i\sigma;i\sigma\tau]. \tag{**}$$

The sum (*) has $n!\,n^n$ terms, while the sum (**) has $(n!)^2$ terms. To
prove that these two sums are equal, we must show that a great deal of
cancellation occurs in (*). To do this, we need a lemma.


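LEMMA 4. Let ρ be a map of {1,...,n} into itself that is not a permutation. There is a transposition θ in $\Sigma_n$ such that θρ = ρ. For any such
θ we have d(σ,ρ) = d(θσ,ρ) for all σ in $\Sigma_n$.


Proof. Since ρ is not a permutation, there exist i and j with i ≠ j and
iρ = jρ. If we take θ to be the transposition (i,j), then θρ = ρ. By definition,

$$\begin{aligned} d(\theta\sigma,\rho) &= \prod_{i=1}^{n} A[i;i\rho]B[i\rho;i\theta\sigma] \\ &= \Bigl(\prod_{i=1}^{n} A[i;i\rho]\Bigr) \Bigl(\prod_{i=1}^{n} B[i\rho;i\theta\sigma]\Bigr) \\ &= \Bigl(\prod_{i=1}^{n} A[i;i\rho]\Bigr) \Bigl(\prod_{i=1}^{n} B[i\theta\rho;i\theta\sigma]\Bigr). \end{aligned}$$

As i ranges over {1,...,n}, so does iθ. Thus

$$\begin{aligned} d(\theta\sigma,\rho) &= \Bigl(\prod_{i=1}^{n} A[i;i\rho]\Bigr) \Bigl(\prod_{i=1}^{n} B[i\rho;i\sigma]\Bigr) \\ &= \prod_{i=1}^{n} A[i;i\rho]B[i\rho;i\sigma] = d(\sigma,\rho). \qquad \square \end{aligned}$$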
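If ρ and θ are as in Lemma 4 and σ is in $\Sigma_n$, then sgn θσ = −sgn σ,
since sgn θ = −1. Thus d(σ,ρ) and d(θσ,ρ) occur with opposite signs in
(*) and therefore cancel. Since ⟨θ⟩ has order 2, we see that the sum
of (sgn σ)d(σ,ρ) as σ runs over a right coset of ⟨θ⟩ in $\Sigma_n$ is 0. Hence

$$\sum_{\sigma} (\operatorname{sgn}\sigma)\, d(\sigma,\rho) = 0$$

unless ρ is in $\Sigma_n$. Therefore

$$\det C = \sum_{\sigma} (\operatorname{sgn}\sigma) \sum_{\rho} d(\sigma,\rho),$$

where now the sum on ρ is over $\Sigma_n$. If for a given σ and ρ we define τ in
$\Sigma_n$ by σ = ρτ, then

$$\begin{aligned} \det C &= \sum_{\sigma} (\operatorname{sgn}\sigma) \sum_{\rho} \prod_{i=1}^{n} A[i;i\rho]B[i\rho;i\sigma] \\ &= \sum_{\rho,\tau} (\operatorname{sgn}\rho\tau) \prod_{i=1}^{n} A[i;i\rho]B[i\rho;i\rho\tau] \end{aligned}$$

and so, by (**),

$$\det C = (\det A)(\det B). \qquad \square$$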
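
The cancellation used in this proof can be observed numerically. The Python sketch below is an illustration under stated assumptions: the 3-by-3 integer matrices A and B are arbitrary choices, indices are 0-based, and the helper names are not from the book. For each map ρ it forms the sum of (sgn σ)d(σ,ρ) over all σ, checks that the sum vanishes when ρ is not a permutation, and compares what remains with (det A)(det B).

```python
from itertools import permutations, product

def sign(perm):
    """Sign of a permutation given as a tuple of images of 0, ..., n-1."""
    n = len(perm)
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def det_leibniz(a):
    """Determinant as the signed sum over all permutations."""
    n = len(a)
    total = 0
    for s in permutations(range(n)):
        term = sign(s)
        for i in range(n):
            term *= a[i][s[i]]
        total += term
    return total

def d(a, b, sigma, rho):
    """d(sigma, rho) = prod_i A[i; i rho] * B[i rho; i sigma]."""
    term = 1
    for i in range(len(a)):
        term *= a[i][rho[i]] * b[rho[i]][sigma[i]]
    return term

def signed_sum(a, b, rho):
    """Sum of (sgn sigma) d(sigma, rho) over all permutations sigma."""
    return sum(sign(s) * d(a, b, s, rho) for s in permutations(range(len(a))))

# Arbitrary example matrices (an assumption, not taken from the text).
A = [[2, 1, 0], [1, 3, -1], [0, 2, 1]]
B = [[1, 0, 2], [-1, 1, 0], [3, 1, 1]]
n = 3

# Maps rho of {0, 1, 2} into itself that are not permutations.
non_perm_sums = [signed_sum(A, B, rho)
                 for rho in product(range(n), repeat=n) if len(set(rho)) < n]
# Contribution of the permutations rho, which is all that survives in (*).
perm_total = sum(signed_sum(A, B, rho) for rho in permutations(range(n)))

print(all(s == 0 for s in non_perm_sums))
print(perm_total, det_leibniz(A) * det_leibniz(B))
```

With n = 3 the sum (*) has 3!·3³ = 162 terms, of which only the 36 with ρ a permutation contribute.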


Theorem 3 has an important corollary that gives a necessary condition for a matrix to be a unit in $M_n(R)$.


COROLLARY 5. If A is a unit in $M_n(R)$, then det A is a unit in R and
$\det A^{-1} = (\det A)^{-1}$.


Proof. Since $AA^{-1} = I$, we have $(\det A)(\det A^{-1}) = \det I = 1$, and so
$\det A^{-1}$ is the inverse of det A in R. □


We will continue our study of the units in $M_n(R)$ in the next section.


EXERCISES

1 Show that the formula

$$\det \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = aei + bfg + cdh - ceg - afh - bdi$$

is consistent with our general definition of the determinant.

2 Evaluate the following determinants by hand and check your
answers using ZDET.

(a) $\begin{vmatrix} 7 & -2 \\ 4 & 3 \end{vmatrix}$    (b) $\begin{vmatrix} 3 & -1 & 2 \\ 0 & 4 & -3 \\ 5 & 1 & -2 \end{vmatrix}$


3 Compute the determinant of the matrix

$$\begin{bmatrix} 1 + 3X + 2X^2 & 4 + X + 3X^2 \\ 2 + X^2 & 3X + 2X^2 \end{bmatrix}$$
in M,(Z, [X]) and check your result using ZVXDET, 


4 At what points in the proofs of Theorems 1 and 3 was the commutativity of R used?

5 Verify the identity det(A+.×B) = (det A)(det B) of Theorem 3
in a particular case by direct calculation using the matrices

$$A = \begin{bmatrix} 9 & 3 & 4 \\ 7 & 0 & 2 \\ 1 & -6 & 5 \end{bmatrix}, \qquad B = \begin{bmatrix} 4 & 1 & 2 \\ 1 & 3 & -5 \\ -1 & 7 & 4 \end{bmatrix}.$$

6 Suppose A is a square matrix with entries in R, and suppose A has
a row of zeros. Show det A = 0.

7 Let A be a square matrix with entries in R and let r be in R. Prove
that multiplying all the entries in a row of A by r multiplies the
determinant of A by r.

8 Let $f: R \to S$ be a homomorphism of rings and let A be a square
matrix with entries in R. Suppose B is the matrix with entries
in S obtained by applying f to the entries of A. Show that det B =
(det A)f.

9 Choose a random 4-by-4 integer matrix by executing

      A←(?4 4⍴11)-6

Select a value for the modulus N and check that N|ZDET A is the
same as ZNDET A. Explain.

10 Let A←(?6 6⍴11)-6. Compare D←RDET A and E←ZDET A.
Note the execution times for each computation. The procedure
ZDET takes longer because it uses only operations that yield integer results at each step.

11 Set ⎕IO←0, N←11, and A←?3 3 3⍴11. We will consider A to
represent an element of $M_3(\mathbb{Z}_{11}[X])$. Let F←ZNXDET A. Select
various values of X modulo 11 and verify that

      ZNDET A ZNXEVAL X

is the same as F ZNXEVAL X. Explain.

12 Initialize the current finite ring to $\mathbb{Z}_6$ as in Exercise 3.8. Let
⎕IO←0, N←6, and A←?4 4⍴6. Compare the execution times for
evaluating ZNDET A and FRDET A. Make further comparisons
for larger random matrices over $\mathbb{Z}_6$.

13 Let A←(?10 10⍴201)-101. Compare the results and execution times of RDET A and MPZDET A.


14 Let A be an n-by-n real matrix and suppose every entry in A has
absolute value not exceeding a. Show that

$$|\det A| \le n!\,a^n \le (na)^n.$$

[It is possible to prove the stronger result $|\det A| \le (a\sqrt{n})^n$.]

15 Suppose A is a square integer matrix and we know a real number
c such that |det A| < c. Assume that $p_1,\dots,p_r$ is a sequence
of distinct primes such that $2c < p_1 p_2 \cdots p_r$. Show that det A can
be determined if we know the determinant of the matrix over
$\mathbb{Z}_{p_i}$ represented by A, 1 ≤ i ≤ r. (The procedure MPZDET uses a
bound similar to the one given in Exercise 14 to compute c. Then
ZNDET is used to compute the determinant modulo
BIGPRIMES[I]
for sufficiently many values of I.)


7. UNITS IN MATRIX RINGS


Let R be a commutative ring. The primary goal of this section is to obtain
a description of the group $GL_n(R)$ of units in the ring $M_n(R)$ of n-by-n
matrices with entries in R. Along the way we will prove some additional
facts about determinants and introduce the concept of a row operation.
Besides being useful here, row operations will be needed in Chapter 5, and
they will play a fundamental role in the algorithms developed in Chapter
6. As in the previous section, the index origin will be assumed to be 1.

A row operation over R is a particular kind of function that maps a
matrix with entries in R to another matrix with the same shape. There are
three types of row operations. In the following descriptions of the types
of row operations, A will be assumed to be an m-by-n matrix over R.


Type 1. Choose an integer i between 1 and m and a unit u of R. The row
operation $O_1(i,u)$ maps A to the matrix B, where B is identical with A
except that the ith row B[i;] of B is u times the ith row of A. For example, if R = $\mathbb{Z}$ and

      ⎕←B←A←3 3⍴⍳9
1 2 3
4 5 6
7 8 9
      B[2;]←-B[2;]
      B
 1  2  3
¯4 ¯5 ¯6
 7  8  9

then B is obtained from A by applying $O_1(2,-1)$.

Type 2. Choose two different integers i and j between 1 and m. The row
operation $O_2(i,j)$ maps A to the matrix C obtained by interchanging the
ith and jth rows of A. Continuing the previous example, we may apply
$O_2(1,3)$ to A as follows.

      C←A
      C[1 3;]←C[3 1;]
      C
7 8 9
4 5 6
1 2 3

Type 3. Choose distinct integers i and j between 1 and m and an element
r of R. The row operation $O_3(i,j,r)$ maps A to the matrix D obtained by
adding r times A[i;] to A[j;]. That is, D[j;] = A[j;] + rA[i;] and D[k;] =
A[k;] for k ≠ j. To apply $O_3(3,2,4)$ to A, we form

      D←A
      D[2;]←D[2;]+4×D[3;]
      D
 1  2  3
32 37 42
 7  8  9

We will write row operations to the left of their arguments. Thus if
O is a row operation and A is a matrix, then the result of applying O to A will
be denoted OA. Row operations may be composed to form new maps.
If O' is another row operation, then (OO')A is O(O'A).
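
The three types of row operations can be sketched as plain Python functions on lists of rows. This is an illustration only; the function names are assumptions, and row numbers are 1-based to match the text. Each function returns a new matrix and leaves its argument unchanged.

```python
def scale_row(a, i, u):
    """Type 1, O_1(i, u): multiply row i by u (u is assumed to be a unit)."""
    b = [row[:] for row in a]
    b[i - 1] = [u * x for x in b[i - 1]]
    return b

def swap_rows(a, i, j):
    """Type 2, O_2(i, j): interchange rows i and j."""
    b = [row[:] for row in a]
    b[i - 1], b[j - 1] = b[j - 1], b[i - 1]
    return b

def add_multiple(a, i, j, r):
    """Type 3, O_3(i, j, r): add r times row i to row j."""
    b = [row[:] for row in a]
    b[j - 1] = [y + r * x for x, y in zip(a[i - 1], a[j - 1])]
    return b

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = scale_row(A, 2, -1)        # O_1(2, -1)
C = swap_rows(A, 1, 3)         # O_2(1, 3)
D = add_multiple(A, 3, 2, 4)   # O_3(3, 2, 4)
print(B, C, D, sep="\n")
```

Composition in the sense of (OO')A = O(O'A) is ordinary function nesting, for example `swap_rows(scale_row(A, 2, -1), 1, 3)`.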


THEOREM 1. The following identities hold for row operations.

(a) $O_1(i,u)O_1(i,v) = O_1(i,uv)$.
(b) $O_2(i,j)^2 = e$, the identity function.
(c) $O_3(i,j,r)O_3(i,j,s) = O_3(i,j,r+s)$.

Proof. We will prove part (c), leaving parts (a) and (b) as exercises.
Suppose B is obtained from A by applying $O_3(i,j,s)$ and C is obtained from
B by applying $O_3(i,j,r)$. If k ≠ j, then C[k;] = B[k;] = A[k;], and

$$C[j;] = B[j;] + rB[i;] = A[j;] + sA[i;] + rA[i;] = A[j;] + (r+s)A[i;].$$

Thus C is the image of A under $O_3(i,j,r+s)$. □


Not all row operations over R have the same domain. For example,
$O_2(i,j)$ is defined only for matrices over R having at least m rows, where
m is the larger of i and j.


THEOREM 2. Each row operation over R is a permutation of its domain. Moreover,

(a) $O_1(i,u)^{-1} = O_1(i,u^{-1})$.
(b) $O_2(i,j)^{-1} = O_2(i,j)$.
(c) $O_3(i,j,r)^{-1} = O_3(i,j,-r)$.

Proof. If u is a unit in R, then by Theorem 1a we have

$$O_1(i,u)O_1(i,u^{-1}) = O_1(i,1) = O_1(i,u^{-1})O_1(i,u).$$

Since $O_1(i,1)$ is the identity function on the set of matrices with at least
i rows, it follows from Corollary 1.3.4 that $O_1(i,u)$ and $O_1(i,u^{-1})$ are
inverse permutations. Parts (b) and (c) are derived in a similar manner
from the corresponding parts of Theorem 1. □


The next theorem shows the relationship between row operations and 
matrix multiplication. 


THEOREM 3. Let O be a row operation over R and let E be the result
of applying O to the m-by-m identity matrix. If A is any matrix over R with
m rows, then OA is the matrix product E+.×A.


Proof. Set B = E+.×A. We must consider three cases corresponding
to the possible types for O. Suppose first that $O = O_1(i,u)$. Then E[k;ℓ] = 0
for k ≠ ℓ. Thus

$$B[k;\ell] = \sum_{t} E[k;t]A[t;\ell] = E[k;k]A[k;\ell].$$

If k ≠ i, then E[k;k] = 1 and B[k;] = A[k;]. Also, B[i;] = E[i;i]A[i;] =
uA[i;], and so B = OA.

Next suppose $O = O_2(i,j)$. Then E[k;k] = 1 if k ≠ i,j, E[i;j] = E[j;i] =
1, and all other entries in E are 0. If k ≠ i,j, then

$$B[k;\ell] = \sum_{t} E[k;t]A[t;\ell] = E[k;k]A[k;\ell] = A[k;\ell].$$

Also,

$$B[i;\ell] = \sum_{t} E[i;t]A[t;\ell] = E[i;j]A[j;\ell] = A[j;\ell].$$

Similarly, B[j;] = A[i;] and, hence, B = OA.

Finally, suppose $O = O_3(i,j,r)$. Then, for all k, we have E[k;k] = 1
and E[j;i] = r. All other entries in E are 0. Thus, if k ≠ j, then

$$B[k;\ell] = \sum_{t} E[k;t]A[t;\ell] = E[k;k]A[k;\ell] = A[k;\ell].$$

In addition,

$$B[j;\ell] = \sum_{t} E[j;t]A[t;\ell] = A[j;\ell] + rA[i;\ell].$$

Therefore B[j;] = A[j;] + rA[i;] and B = OA. □


Any matrix E that is the result of applying a row operation to an
identity matrix is called an elementary matrix. The type of elementary
matrix is the type of the row operation used to construct it. Theorem 3
states that a row operation may be applied to a matrix by multiplying the
matrix on the left by the appropriate elementary matrix. Let us verify
Theorem 3 in a particular case. In the prior example we obtained D as the
image of A under $O_3(3,2,4)$. The corresponding elementary matrix is constructed as follows.

      E←(⍳3)∘.=⍳3
      E[2;]←E[2;]+4×E[3;]

The calculation

      ⎕←F←E+.×A
 1  2  3
32 37 42
 7  8  9
      ∧/,F=D
1

shows that indeed D is E+.×A. If

      E1←E2←(⍳4)∘.=⍳4
      E1[2;]←¯3×E1[2;]
      E2[1 3;]←E2[3 1;]
      E1
1  0 0 0
0 ¯3 0 0
0  0 1 0
0  0 0 1
      E2
0 0 1 0
0 1 0 0
1 0 0 0
0 0 0 1

then E1 and E2 are the 4-by-4 elementary matrices corresponding to the
row operations $O_1(2,-3)$ and $O_2(1,3)$, respectively, over R.
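
The same check of Theorem 3 can be sketched in Python. This is an illustrative translation rather than the book's code; `identity`, `matmul`, and `add_multiple` are assumed helper names, and rows are numbered from 1 as in the text.

```python
def identity(m):
    """The m-by-m identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]

def matmul(x, y):
    """Ordinary matrix product of x and y (the APL +.x inner product)."""
    return [[sum(x[i][t] * y[t][j] for t in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def add_multiple(a, i, j, r):
    """Row operation O_3(i, j, r): add r times row i to row j."""
    b = [row[:] for row in a]
    b[j - 1] = [v + r * w for w, v in zip(a[i - 1], a[j - 1])]
    return b

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
E = add_multiple(identity(3), 3, 2, 4)   # elementary matrix for O_3(3, 2, 4)
F = matmul(E, A)
print(E)
print(F == add_multiple(A, 3, 2, 4))
```

Applying the row operation to the identity first and then multiplying gives the same result as applying it to A directly, which is the content of Theorem 3 in this one case.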


THEOREM 4. Every m-by-m elementary matrix E is a unit in $M_m(R)$.
Moreover, $E^{-1}$ is also an elementary matrix.


Proof. Let E = OI, where O is a row operation and I is the m-by-m
identity matrix. Set O' equal to the row operation that is the inverse of
O and set E' = O'I. Then, suppressing the symbol +.× for matrix multiplication, we have

$$I = (OO')I = O(O'I) = E(E'I) = EE'.$$

Similarly, E'E = I, and so E and E' are inverses of each other in $M_m(R)$. □


In general, the elementary matrices generate a proper subgroup of
the group of units in $M_m(R)$. However, it will turn out that, for many
familiar rings R, every unit in $M_m(R)$ is a product of elementary matrices.

Our next task is to compute the determinants of the various types of
elementary matrices. First, we define a generalization of elementary matrices of type 2. Let σ be an element of $\Sigma_m$. The permutation matrix corresponding to σ is the m-by-m matrix B = B(σ) such that B[i;j] = 1 if j =
iσ and B[i;j] = 0 otherwise. For example, if m = 4 and σ = (1,3,2,4), then
B(σ) is the matrix

$$\begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}.$$

If σ is the 2-cycle (i,j), then B(σ) is just the m-by-m elementary matrix
corresponding to $O_2(i,j)$.




THEOREM 5. Let σ be in $\Sigma_m$. Then det B(σ) = sgn σ.

Proof. Let B = B(σ). Then, by definition,

$$\det B = \sum_{\tau} (\operatorname{sgn}\tau) \prod_{i=1}^{m} B[i;i\tau].$$

Since B[i;iτ] = 0 unless iτ = iσ, the only nonzero term in det B occurs when
τ = σ. Thus

$$\det B = (\operatorname{sgn}\sigma) \prod_{i=1}^{m} B[i;i\sigma] = \operatorname{sgn}\sigma. \qquad \square$$

THEOREM 6. Let E be the elementary matrix obtained by applying the
row operation O to the m-by-m identity matrix.

(a) If $O = O_1(i,u)$, then det E = u.
(b) If $O = O_2(i,j)$, then det E = −1.
(c) If $O = O_3(i,j,r)$, then det E = 1.


Proof. (a) If $O = O_1(i,u)$, then E is upper (and also lower) triangular.
Therefore det E is the product of the entries E[k;k], which are all 1 except
for E[i;i] = u. Thus det E = u. (b) If $O = O_2(i,j)$, then i ≠ j and E = B(σ),
where σ = (i,j). By Theorem 5, det E = sgn σ = −1. (c) If $O = O_3(i,j,r)$,
then E is either upper triangular or lower triangular, according to whether
i > j or i < j. In either case, det E is the product of the diagonal entries
in E, which are all 1. Hence det E = 1. □


THEOREM 7. Let B be the matrix obtained by applying the row
operation O to the m-by-m matrix A.

(a) If $O = O_1(i,u)$, then det B = u(det A).

(b) If $O = O_2(i,j)$, then det B = −det A.

(c) If $O = O_3(i,j,r)$, then det B = det A.

Proof. Let E be the m-by-m elementary matrix corresponding to
O. Then B is E+.×A and so, by Theorem 6.3, we have det B = (det E)
(det A). The theorem now follows from Theorem 6. □


The following corollary is an important consequence of Theorem 7.


COROLLARY 8. If two rows of a square matrix A over R are the same,
then det A = 0.


Proof. Assume A[i;] = A[j;] and i ≠ j. Applying $O_3(i,j,-1)$ to A, we
obtain a matrix B whose jth row consists entirely of zeros. By Theorem 7
and Exercise 6.6, det A = det B = 0. □


So far we have dealt only with row operations. One can also define column operations in a completely analogous manner. We leave it as an exercise to formulate and prove the theorems about column operations that correspond to our results concerning row operations. We will make use of the
following analogues of Theorem 7b and Corollary 8. If A is a square matrix,
then interchanging two columns of A changes the sign of det A. If two
columns of A are equal, det A = 0.

The 3-by-3 determinant

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = aei + bfg + cdh - ceg - afh - bdi$$

can be written as

$$a(ei - fh) - b(di - fg) + c(dh - eg) = a \begin{vmatrix} e & f \\ h & i \end{vmatrix} - b \begin{vmatrix} d & f \\ g & i \end{vmatrix} + c \begin{vmatrix} d & e \\ g & h \end{vmatrix}.$$

It will be useful to obtain similar expansions for any determinant.
Let A be in $M_n(R)$. By definition, det A is the sum of the n! terms

$$(\operatorname{sgn}\sigma) \prod_{k=1}^{n} A[k;k\sigma].$$

Which of these terms contains a particular entry A[i;j] as a factor? Clearly,
the answer is the terms that correspond to permutations σ with iσ = j. Since
each term in det A contains exactly one factor from the ith row of A, we
have

$$\det A = \sum_{j=1}^{n} A[i;j] \sum_{i\sigma=j} (\operatorname{sgn}\sigma) \prod_{k\neq i} A[k;k\sigma].$$

Define C to be the n-by-n matrix with

$$C[i;j] = \sum_{i\sigma=j} (\operatorname{sgn}\sigma) \prod_{k\neq i} A[k;k\sigma].$$

Then, for 1 ≤ i ≤ n, we have

$$\det A = \sum_{j=1}^{n} A[i;j]C[i;j].$$

We call C[i;j] the ijth cofactor of A and C the matrix of cofactors of A.
Since each term in det A has exactly one factor from the jth column of
A, we also have

$$\det A = \sum_{i=1}^{n} A[i;j] \sum_{i\sigma=j} (\operatorname{sgn}\sigma) \prod_{k\neq i} A[k;k\sigma] = \sum_{i=1}^{n} A[i;j]C[i;j].$$

Thus det A is the sum of the products of the entries in any one row or
column of A with the corresponding cofactors.

By its definition, C[i;j] does not depend on the entries in A[i;] or
A[;j]. Fix i and j and let A' be the matrix obtained from A by setting all
entries in A[i;] and A[;j] to 0, except for A[i;j], which is set equal to 1.
For example, if

$$A = \begin{bmatrix} 3 & -1 & 4 & 2 \\ 1 & 2 & 5 & -3 \\ 7 & 2 & -2 & 1 \\ 8 & -6 & 3 & 2 \end{bmatrix}$$

and i = 3, j = 2, then

$$A' = \begin{bmatrix} 3 & 0 & 4 & 2 \\ 1 & 0 & 5 & -3 \\ 0 & 1 & 0 & 0 \\ 8 & 0 & 3 & 2 \end{bmatrix}.$$

Let C' be the matrix of cofactors of A'. Then

$$\det A' = \sum_{t=1}^{n} A'[i;t]C'[i;t] = C'[i;j].$$

However, C'[i;j] = C[i;j], since C[i;j] does not depend on the entries
in the ith row or jth column of A. Thus C[i;j] = det A'.

We have seen that if we multiply A[i;j] by the cofactor C[i;j] and
sum over j, then the result is det A. Suppose, instead, we multiply A[i;j]
and C[k;j] with k ≠ i and sum over j. What do we get then? To help answer
this question, let us evaluate

$$\sum_{j=1}^{4} A[2;j]C[4;j],$$

where A is the preceding 4-by-4 matrix. Let

$$B = \begin{bmatrix} 3 & -1 & 4 & 2 \\ 1 & 2 & 5 & -3 \\ 7 & 2 & -2 & 1 \\ 1 & 2 & 5 & -3 \end{bmatrix}.$$

Here the fourth row of A has been changed so that it is equal to the second
row. Let D be the matrix of cofactors of B. We wish to compare the entries
C[4;j] and D[4;j]. Take j = 2 as an example. By our previous discussion,
C[4;2] is the determinant of the matrix

$$\begin{bmatrix} 3 & 0 & 4 & 2 \\ 1 & 0 & 5 & -3 \\ 7 & 0 & -2 & 1 \\ 0 & 1 & 0 & 0 \end{bmatrix}.$$

But D[4;2] is the determinant of the same matrix, and so C[4;2] = D[4;2].
In the same way we see that C[4;j] = D[4;j], 1 ≤ j ≤ 4. Since B[4;j] =
A[2;j], we have

$$\sum_{j=1}^{4} A[2;j]C[4;j] = \sum_{j=1}^{4} B[4;j]D[4;j] = \det B.$$

But det B = 0 by Corollary 8. Thus

$$\sum_{j=1}^{4} A[2;j]C[4;j] = 0.$$


THEOREM 9. Let A be an n-by-n matrix and let C be the matrix of
cofactors of A. If k ≠ i and j ≠ ℓ, then

$$\sum_{t=1}^{n} A[i;t]C[k;t] = \sum_{t=1}^{n} A[t;j]C[t;\ell] = 0.$$

Proof. Let B be the matrix obtained from A by replacing A[k;] with
A[i;]. Just as in our example, we see that C[k;t] is the ktth cofactor of
B, so

$$\sum_{t=1}^{n} A[i;t]C[k;t] = \sum_{t=1}^{n} B[k;t]C[k;t] = \det B = 0.$$

Similarly, we find that

$$\sum_{t=1}^{n} A[t;j]C[t;\ell] = 0. \qquad \square$$

For any square matrix A, let A* be the transpose of the matrix of
cofactors of A. We call A* the adjoint of A.


THEOREM 10. If A is a square matrix, then AA* = A*A = dI, where
d = det A and I is the identity matrix.


Proof. Let n be the number of rows in A and set P = AA*. Then

$$P[i;k] = \sum_{t=1}^{n} A[i;t]A^*[t;k] = \sum_{t=1}^{n} A[i;t]C[k;t],$$

where C is the matrix of cofactors of A. Thus P[i;i] = det A = d and, by
Theorem 9, P[i;k] = 0 if i ≠ k. Therefore P = dI.

Now set Q = A*A. Then

$$Q[\ell;j] = \sum_{t=1}^{n} A^*[\ell;t]A[t;j] = \sum_{t=1}^{n} A[t;j]C[t;\ell].$$

Thus Q is also equal to dI. □

We can now give the promised description of the units in $M_n(R)$.


THEOREM 11. Let A be an n-by-n matrix over a commutative ring
R. Then A is a unit in $M_n(R)$ if and only if det A is a unit in R.


Proof. If A is a unit in $M_n(R)$, then d = det A is a unit in R by Corollary 6.5. Now suppose d is a unit. Then, by Theorem 10,

$$A(d^{-1}A^*) = d^{-1}AA^* = d^{-1}dI = I = (d^{-1}A^*)A.$$

Therefore $d^{-1}A^*$ is a two-sided inverse for A in $M_n(R)$ and A is a unit. □


We now wish to show that each cofactor of an n-by-n matrix A is, up
to sign, the determinant of an (n−1)-by-(n−1) submatrix of A. We begin with
an example using the matrix

$$A = \begin{bmatrix} 3 & -1 & 4 & 2 \\ 1 & 2 & 5 & -3 \\ 7 & 2 & -2 & 1 \\ 8 & -6 & 3 & 2 \end{bmatrix}.$$

We know that the cofactor C[3;2] of A is the determinant of the matrix

$$A' = \begin{bmatrix} 3 & 0 & 4 & 2 \\ 1 & 0 & 5 & -3 \\ 0 & 1 & 0 & 0 \\ 8 & 0 & 3 & 2 \end{bmatrix}.$$

Let B be the 3-by-3 matrix obtained by deleting the third row and second
column of A. Thus

$$B = \begin{bmatrix} 3 & 4 & 2 \\ 1 & 5 & -3 \\ 8 & 3 & 2 \end{bmatrix}.$$

Then det A' = −det B. We could verify this fact by direct calculation of
det A' and det B. However, we will use another approach that will give
us an understanding of why the result holds.

If we interchange the second and third columns of A' and then interchange the third and fourth columns, we obtain the matrix

$$D = \begin{bmatrix} 3 & 4 & 2 & 0 \\ 1 & 5 & -3 & 0 \\ 0 & 0 & 0 & 1 \\ 8 & 3 & 2 & 0 \end{bmatrix}.$$

Each column interchange reversed the sign of the determinant, and so
det D = det A'. Now, if we interchange the third and fourth rows of D,
we get

$$E = \begin{bmatrix} 3 & 4 & 2 & 0 \\ 1 & 5 & -3 & 0 \\ 8 & 3 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

and det E = −det D. Note that the submatrix of E consisting of the first
three rows and first three columns is B. Now

$$\det E = \sum_{\sigma} (\operatorname{sgn}\sigma) \prod_{k=1}^{4} E[k;k\sigma],$$

where σ ranges over $\Sigma_4$. If 4σ ≠ 4, then E[4;4σ] = 0. Thus we need only
sum over those σ with 4σ = 4. If 4σ = 4, then the restriction τ of σ to {1,2,3}
is an element of $\Sigma_3$ and the map $\sigma \mapsto \tau$ is a 1-1 correspondence. Moreover, since the sign of a permutation depends only on the number of cycles
of even length, sgn τ = sgn σ. Since E[4;4] = 1, we have

$$\det E = \sum_{\tau} (\operatorname{sgn}\tau) \prod_{k=1}^{3} E[k;k\tau] = \sum_{\tau} (\operatorname{sgn}\tau) \prod_{k=1}^{3} B[k;k\tau] = \det B.$$

Thus C[3;2] = det A' = det D = −det E = −det B.

Now let A be any n-by-n matrix over R. We define the ijth minor of
A to be the determinant M[i;j] of the (n−1)-by-(n−1) matrix obtained
by deleting the ith row and jth column of A.


THEOREM 12. Let A be a square matrix and let C and M be the
matrices of cofactors and minors of A, respectively. Then

$$C[i;j] = (-1)^{i+j} M[i;j].$$

Proof. The preceding example contains all the essential ideas of the
general case. Let A' be the matrix obtained from A by setting A[i;j] equal
to 1 and all other entries in A[i;] and A[;j] equal to 0. Let B be obtained
from A by deleting the ith row and jth column. Then C[i;j] = det A' and
M[i;j] = det B. Performing n − j column interchanges and then n − i row
interchanges transforms A' into the matrix E, which has the form

$$E = \begin{bmatrix} & & & 0 \\ & B & & \vdots \\ & & & 0 \\ 0 & \cdots & 0 & 1 \end{bmatrix}.$$

We have $\det A' = (-1)^{n-j}(-1)^{n-i} \det E = (-1)^{2n-i-j} \det E = (-1)^{i+j} \det E$,
since $i + j \equiv 2n - i - j \pmod 2$. Now, by an argument analogous to the previous one, det E = det B. Thus $C[i;j] = \det A' = (-1)^{i+j} \det E = (-1)^{i+j} \det B =
(-1)^{i+j} M[i;j]$. □

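We can now write the determinant of the n-by-n matrix A in terms
of the determinants of n submatrices of A of size (n−1)-by-(n−1). For
any i and j we have

$$\det A = \sum_{t=1}^{n} (-1)^{i+t} A[i;t]M[i;t],$$

$$\det A = \sum_{t=1}^{n} (-1)^{t+j} A[t;j]M[t;j].$$

These formulas are referred to as the expansion of det A by the minors of
the ith row and the jth column, respectively. Again using our example

$$A = \begin{bmatrix} 3 & -1 & 4 & 2 \\ 1 & 2 & 5 & -3 \\ 7 & 2 & -2 & 1 \\ 8 & -6 & 3 & 2 \end{bmatrix}$$

and expanding det A by the minors of the third row, we get

$$\begin{aligned} \det A &= (-1)^4\, 7 \begin{vmatrix} -1 & 4 & 2 \\ 2 & 5 & -3 \\ -6 & 3 & 2 \end{vmatrix} + (-1)^5\, 2 \begin{vmatrix} 3 & 4 & 2 \\ 1 & 5 & -3 \\ 8 & 3 & 2 \end{vmatrix} \\ &\quad + (-1)^6 (-2) \begin{vmatrix} 3 & -1 & 2 \\ 1 & 2 & -3 \\ 8 & -6 & 2 \end{vmatrix} + (-1)^7\, 1 \begin{vmatrix} 3 & -1 & 4 \\ 1 & 2 & 5 \\ 8 & -6 & 3 \end{vmatrix} \\ &= 7(109) - 2(-121) - 2(-60) - 1(-17) \\ &= 1142. \end{aligned}$$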
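
The expansion by minors translates directly into a recursive procedure. The Python sketch below is an illustration only (the names `minor_matrix`, `det`, and `expand_along_row` are assumptions, and indices are 0-based, which leaves the parity of i + t unchanged). It recomputes the four third-row minors of the example and confirms that every row gives the same value.

```python
def minor_matrix(a, i, j):
    """Matrix obtained from a by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** t * a[0][t] * det(minor_matrix(a, 0, t))
               for t in range(len(a)))

def expand_along_row(a, i):
    """Sum over t of (-1)^(i+t) * a[i][t] * M[i][t]."""
    return sum((-1) ** (i + t) * a[i][t] * det(minor_matrix(a, i, t))
               for t in range(len(a)))

A = [[3, -1, 4, 2],
     [1, 2, 5, -3],
     [7, 2, -2, 1],
     [8, -6, 3, 2]]

third_row_minors = [det(minor_matrix(A, 2, t)) for t in range(4)]
print(third_row_minors)
print([expand_along_row(A, i) for i in range(4)])
```

The recursion performs on the order of n! multiplications, so it serves to illustrate the formula rather than to compute large determinants.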


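The formula $A^{-1} = (\det A)^{-1}A^*$ can be used to compute inverses
in $M_n(R)$. For example, if

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix},$$

then the matrices M, C, and A* are given by

$$M = \begin{bmatrix} d & c \\ b & a \end{bmatrix}, \qquad C = \begin{bmatrix} d & -c \\ -b & a \end{bmatrix}, \qquad A^* = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.$$

Thus, if det A = ad − bc is a unit in R, we have

$$A^{-1} = (ad - bc)^{-1} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.$$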


If n is large, however, this method requires the evaluation of a great many 
determinants and is quite time consuming. If R happens to be the integers, 
a field, or the polynomial ring F'[X] over a field, then there is a much more 
efficient algorithm for computing inverses of matrices. This algorithm will 
be discussed in Section 6.2. 
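
For small matrices the adjoint formula is easy to try out. The following Python sketch is an illustration under stated assumptions: the helper names are hypothetical, the two 2-by-2 matrices and the modulus 5 are chosen only as examples, and `pow(d, -1, n)` (available in Python 3.8 and later) stands in for inverting det A in $\mathbb{Z}_n$. It is not the efficient algorithm referred to above.

```python
def minor_matrix(a, i, j):
    """Matrix obtained from a by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** t * a[0][t] * det(minor_matrix(a, 0, t))
               for t in range(len(a)))

def adjoint(a):
    """Transpose of the matrix of cofactors of a."""
    n = len(a)
    cof = [[(-1) ** (i + j) * det(minor_matrix(a, i, j)) for j in range(n)]
           for i in range(n)]
    return [[cof[j][i] for j in range(n)] for i in range(n)]

def inverse_mod(a, modulus):
    """Inverse of a over the integers mod `modulus`, via (det a)^(-1) adjoint(a).

    pow raises ValueError when det a is not a unit modulo `modulus`.
    """
    d_inv = pow(det(a) % modulus, -1, modulus)
    return [[(d_inv * x) % modulus for x in row] for row in adjoint(a)]

C = [[3, 2], [7, 5]]   # det C = 1, so the adjoint is already the inverse over Z
E = [[2, 1], [4, 3]]   # det E = 2, a unit modulo 5
print(adjoint(C))
print(inverse_mod(E, 5))
```

When det A is 1 or −1 the inverse has integer entries; modulo n, the only requirement is that det A be a unit in $\mathbb{Z}_n$, in line with Theorem 11.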

The monadic primitive APL operation ⌹ computes approximate inverses in $M_n(\mathbb{R})$.


U<+A<+2 203 27 6 


O+B<+HA 
1.5 0.5 
1.75 0.75 
A+.xB 
1.000000000£0 4.440892099F 16 
“2.994801444F 15 1.000000000F0 


Note that although the value printed for B is the exact inverse of A, the value 
stored in the computer is not exact, so A+.xB is not quite the identity 
matrix. 

The procedures ZMATINV and ZNMATINV compute inverses in $M_m(\mathbb{Z})$
and $M_m(\mathbb{Z}_n)$, respectively. With ZNMATINV the value N of n may not exceed $10^7$.


      ⎕←C←2 2⍴3 2 7 5
3 2
7 5
      ⎕←D←ZMATINV C
 5 ¯2
¯7  3
      C+.×D
1 0
0 1
      ⎕←E←2 2⍴2 1 4 3
2 1
4 3
      N←5
      ⎕←F←ZNMATINV E
4 2
3 1
      E ZNMATPROD F
1 0
0 1
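
For readers without access to CLASSLIB, the two inverses above can be checked against the formula $A^{-1} = (\det A)^{-1}A^*$. The Python sketch below is an assumption-laden illustration, not the algorithm used by ZMATINV or ZNMATINV (that algorithm is the subject of Section 6.2); the function names are hypothetical, and `pow(d, -1, n)` requires Python 3.8 or later.

```python
def minor(a, i, j):
    """Copy of a with row i and column j deleted (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]


def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** t * a[0][t] * det(minor(a, 0, t)) for t in range(len(a)))


def adjugate(a):
    """A*: the transpose of the matrix of cofactors of a."""
    n = len(a)
    if n == 1:
        return [[1]]
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)]
            for i in range(n)]


def inverse_int(a):
    """Inverse in M_n(Z); det a must be a unit of Z, that is, 1 or -1."""
    d = det(a)
    if d not in (1, -1):
        raise ValueError("det is not a unit in Z")
    return [[d * x for x in row] for row in adjugate(a)]   # d is its own inverse


def inverse_mod(a, n):
    """Inverse in M_k(Z_n); raises ValueError if det a is not a unit mod n."""
    d_inv = pow(det(a) % n, -1, n)
    return [[(d_inv * x) % n for x in row] for row in adjugate(a)]


print(inverse_int([[3, 2], [7, 5]]))      # [[5, -2], [-7, 3]]
print(inverse_mod([[2, 1], [4, 3]], 5))   # [[4, 2], [3, 1]]
```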
EXERCISES 


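1 What are the results of applying each of the integer row operations
$O_1(3,-1)$, $O_2(1,2)$, and $O_3(1,3,-2)$ to the matrix

$$U = \begin{bmatrix} 7 & -2 & 1 \\ -3 & 2 & 6 \\ 4 & -1 & -5 \end{bmatrix}?$$

2 Considering the matrix

$$P = \begin{bmatrix} 3 & 1 & 6 \\ 2 & 0 & 4 \\ 1 & 5 & 3 \end{bmatrix}$$
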

to be an element of M,;(Z,), apply each of the row operations 
$O_1(2,4)$, $O_2(2,1)$, and $O_3(2,3,3)$ to P.

3 Complete the proof of Theorem 1. 
4 Complete the proof of Theorem 2.


5 Construct the elementary matrices in $M_3(Z)$ corresponding to the
row operations $O_1(3,-1)$, $O_2(1,3)$, and $O_3(1,3,-2)$. Compute
the products of these elementary matrices with the matrix U in
Exercise 1. Compare the results with your answers to Exercise 1.


6 Compute the inverses of the elementary matrices obtained in
Exercise 5.

7 Show that the map taking an element σ in $\Sigma_n$ to the permutation
matrix B(σ) is an injective homomorphism of $\Sigma_n$ into the group
$GL_n(R)$.

8 Suppose SIGMA is a vector that is a permutation of ⍳N. Write
an APL expression for the permutation matrix corresponding to
SIGMA.

9 Let A be in $M_n(R)$ and let σ be in $\Sigma_n$. Define B to be the n-by-n
matrix such that B[i;j] = A[iσ;j]. Show det B = (sgn σ) det A.

10 Suppose C is the matrix of cofactors of the square matrix A.
Show that C' is the matrix of cofactors of A'.

11 Let A be an N-by-N matrix and let I and J be elements of ⍳N. Write
an APL expression for the matrix B obtained by deleting the
Ith row and Jth column from A.

12 Compute the determinant of the matrix U in Exercise 1, first using
the expansion by minors of the third row and then using the
expansion by minors of the second column.

13 For each of the following pairs of a ring R and a matrix A in
$M_n(R)$, show that A is a unit in $M_n(R)$ and use the formula
$A^{-1} = (\det A)^{-1}A^*$ to compute $A^{-1}$.


(a) R=Z, A ; | 


7 8 
b = 3 
(b) R=Q, Ae | 
2 7 
(c) R=Z, 1 3 1 
A=] 1 2 O 
-1 -5 2 


(d) R=Z,, A= Pe | 
5 2 


(e) R=ZIX], ,_ pts Xx? 
—4 1 — 2X 
14 Define column operations. Show that a column operation can be
applied to a matrix A by applying a row operation to A' and taking
the transpose of the result.




15 Show that every elementary matrix can be obtained from the
identity matrix by a column operation. (Thus there is no need to
distinguish between "row elementary matrices" and "column
elementary matrices".)


16 State and prove the analogues of Theorems 1, 2, 3, 6, and 7 and 
Corollary 8 for column operations. 


17 Show that column operations commute with row operations. That 
is, suppose O is a row operation and O’ is a column operation. 
Prove that applying O and then O’ gives the same result as applying 
O' and then O. 


18 Let A and B be square matrices over R. Show that the determinant
of the block matrix

$$\begin{bmatrix} A & C \\ 0 & B \end{bmatrix}$$

is (det A)(det B). Here C is any matrix over R with the appropriate
shape and 0 denotes a matrix of zeros.


19 Let A be an n-by-n matrix with entries in R. Let P denote the set
of all subsets of S = {1, ..., n}. For Y in P, let d(Y) be the
determinant of the submatrix of A consisting of the entries in the
first r = |Y| rows and the columns whose indices are in Y. If
$Y = \{y_1, \ldots, y_r\}$ and $y_1 < y_2 < \cdots < y_r$, show that

$$d(Y) = \sum_{i=1}^{r} (-1)^{r+i} A[r;y_i]\, d(Y - \{y_i\}).$$

Explain how to compute det A = d(S) by starting with d(∅) = 1.
Show that the total number of arithmetic operations involved
is less than some constant times $n2^n$. (This is the method used in
FPRDET.)


20 Use monadic H, ZMATINV, or ZNMATINV, as appropriate, to 
compute the inverses of the matrices in Exercise 13a—d. 


8. FIELDS OF FRACTIONS 


Let R be a ring. In Section 2 we noted that the map h: Z → R taking m in
Z to m1 in R is a ring homomorphism. We defined the characteristic of
R to be the nonnegative generator of the kernel of h. Thus, if R has
characteristic 0, then h is an injection and the image of Z under h is a subring
of R isomorphic to Z. If R happens to be a field, we can say even more.




THEOREM 1. If F is a field of characteristic 0, then F contains a 
subfield isomorphic to Q. 


Proof. Let h: Z → F map m to m1. We would like to extend h to a
map of Q into F. Each element of Q can be written as a/b, with a and
b in Z and b ≠ 0. By assumption, h is injective, and so h(b) ≠ 0. It is
therefore reasonable to try to map a/b to $h(a)h(b)^{-1} = h(a)/h(b)$ in F. But is
this well defined? Suppose a/b = c/d, with a, b, c, d all in Z and b and
d not 0. Then ad = bc and hence h(a)h(d) = h(b)h(c). The element h(b)h(d)
of F is not 0, so we may divide by it to obtain h(a)/h(b) = h(c)/h(d). Thus
a/b ↦ h(a)/h(b) does define a map h' of Q into F. For all a/b and c/d in
Q, we have


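$$\begin{aligned}
h'\!\left(\frac{a}{b} + \frac{c}{d}\right) &= h'\!\left(\frac{ad + bc}{bd}\right) = \frac{h(ad + bc)}{h(bd)} \\
&= \frac{h(a)h(d) + h(b)h(c)}{h(b)h(d)} = \frac{h(a)}{h(b)} + \frac{h(c)}{h(d)} \\
&= h'(a/b) + h'(c/d).
\end{aligned}$$

Also,

$$\begin{aligned}
h'\!\left(\frac{a}{b} \times \frac{c}{d}\right) &= h'\!\left(\frac{ac}{bd}\right) = \frac{h(ac)}{h(bd)} = \frac{h(a)h(c)}{h(b)h(d)} \\
&= \frac{h(a)}{h(b)} \times \frac{h(c)}{h(d)} = h'(a/b)\,h'(c/d).
\end{aligned}$$

Therefore h' is a ring homomorphism. If h'(a/b) = 0, then h(a)/h(b) = 0
and so h(a) = 0. This implies that a = 0. Thus the kernel of h' is trivial
and the image of Q under h' is a subfield of F isomorphic to Q. □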


Theorem 1 states that any field that contains a copy of Z also con- 
tains a copy of Q. The purpose of this section is to obtain a similar result 
for any integral domain. An embedding of one ring R in another ring S is 
an injective homomorphism of R into S. Suppose R is an integral domain. 
It is reasonable to ask the following questions. 


1. Can R be embedded in a field F?

2. If so, can we choose F in such a way that whenever R can be
embedded in a field K, then F can be embedded in K?


We will be able to answer both these questions affirmatively.

THEOREM 2. Every integral domain can be embedded in a field.


Proof. Let R be an integral domain. In constructing a field F and an
embedding of R into F, we will be guided by the example of the familiar




embedding of Z into Q. Elements of Q can be written as quotients of in- 
tegers, but not in a unique manner. This suggests that elements of F should 
be described as quotients of elements of R. Since division of one element 
of R by another is not defined, we must introduce "formal quotients".
This is done as follows. Let M be the set of all ordered pairs (a,b), with
a and b in R and b ≠ 0. We define a relation ~ on M by saying that (a,b) ~
(c,d) if and only if ad = bc.


LEMMA 3. The relation ~ is an equivalence relation on M.


Proof. The statement (a,b) ~ (a,b) is equivalent to ab = ba, which 
holds because R is commutative. Thus ~ is reflexive. If (a,b) ~ (c,d), then 
ad = bc. Therefore cb = da and (c,d) ~ (a,b). Hence ~ is symmetric. Now 
suppose (a,b) ~ (c,d) and (c,d) ~ (e,f). Then ad = bc and cf = de. Multi- 
plying the first equality by f and the second by b, we obtain 


    daf = bcf = dbe.

Since d ≠ 0 and R is an integral domain, the equality daf = dbe implies
af = be. Thus (a,b) ~ (e,f) and ~ is transitive. □


Let F be the set of equivalence classes of ~ on M. For (a,b) in M, we 
will denote the equivalence class containing (a,b) by [a,b]. We think of 
[a,b] as the formal quotient of a and b. We must now define binary
operations + and × on F in such a way that (F,+,×) is a field. In any field
we have

$$\frac{u}{v} + \frac{x}{y} = \frac{uy + vx}{vy}, \qquad
\left(\frac{u}{v}\right)\left(\frac{x}{y}\right) = \frac{ux}{vy}.$$

This suggests the following definitions.

    [a,b] + [c,d] = [ad + bc, bd].
    [a,b] × [c,d] = [ac, bd].


However, we must show that these operations are well defined. Suppose
[a,b] = [a',b'] and [c,d] = [c',d']. Then ab' = ba' and cd' = dc'.
Therefore ab'cd' = a'bc'd and, hence, [ac,bd] = [a'c',b'd']. This means that the
product of two elements of F does not depend on the particular
representation of the factors, so × is well defined as a binary operation on F. We
also have

    (ad + bc)b'd' = ab'dd' + bb'cd' = ba'dd' + bb'dc'
                  = bd(a'd' + b'c').




Thus [ad + bc, bd] = [a'd’ + b'c', b’d’] and + is well defined too. 

We must now verify that (F,+,X) satisfies all of the field axioms. 
This is not difficult, and we leave most of the work as an exercise. The 
additive identity element is 0 = [0,1] and [a,b] = 0 if and only if a = 0. 
The additive inverse of [a,b] is [—a,b]. The multiplicative identity is 
1 = [1,1] and, if [a,b] #0, then the multiplicative inverse of [a,b] is 
[b,a]. 

All that remains is to define a map h of R into F and check that h
is an embedding. For a in R, set h(a) = [a,1] in F. Then

    h(a + b) = [a + b,1] = [a,1] + [b,1] = h(a) + h(b),
    h(ab) = [ab,1] = [a,1] × [b,1] = h(a) × h(b).

Therefore h is a ring homomorphism. If h(a) = [a,1] = 0, then a = 0. Thus
h is an embedding. This completes the proof of Theorem 2. □
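
The construction in the proof can be tried out with R = Z. The following Python sketch is only an illustration under that assumption; the class name `Quot` and the helper `embed` are hypothetical. Pairs are deliberately left unreduced, so that equality is decided by the relation ad = bc exactly as in Lemma 3.

```python
class Quot:
    """Formal quotient [a, b] of integers with b != 0 (never reduced)."""

    def __init__(self, a, b):
        if b == 0:
            raise ValueError("second component must be nonzero")
        self.a, self.b = a, b

    def __eq__(self, other):
        # (a,b) ~ (c,d) if and only if ad = bc
        return self.a * other.b == self.b * other.a

    def __add__(self, other):
        # [a,b] + [c,d] = [ad + bc, bd]
        return Quot(self.a * other.b + self.b * other.a, self.b * other.b)

    def __mul__(self, other):
        # [a,b] x [c,d] = [ac, bd]
        return Quot(self.a * other.a, self.b * other.b)

    def inverse(self):
        # The multiplicative inverse of a nonzero [a,b] is [b,a].
        if self.a == 0:
            raise ZeroDivisionError("zero has no multiplicative inverse")
        return Quot(self.b, self.a)

    def __repr__(self):
        return f"[{self.a},{self.b}]"


def embed(a):
    """The embedding h of Z into the field of fractions: h(a) = [a,1]."""
    return Quot(a, 1)


# Different representatives of the same classes give equal sums and products.
print(Quot(1, 2) + Quot(1, 3) == Quot(2, 4) + Quot(-3, -9))   # True
print(Quot(2, 3) * Quot(2, 3).inverse() == embed(1))          # True
```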


The field F constructed in the proof of Theorem 2 is called the field
of fractions of R. Sometimes F is referred to as the field of quotients of R. It
is important to be aware of the difference between the terms "field of
quotients" and "quotient field". The latter refers to a quotient ring R/I of
R that happens to be a field. To avoid any confusion, we will always use
the term "field of fractions".

The field of fractions of Z is Q. After the integers, the next most
commonly encountered integral domains are the polynomial rings K[X], where
K is a field. The field of fractions of K[X] is called the field of rational
functions in one variable over K. It is denoted K(X). Elements of K(X) are
written as quotients of polynomials in K[X]. For example,


$$\frac{2 - 3X + 4X^2 - X^4}{7X + 6X^2 + 3X^3}$$

is an element of Q(X).
We have answered the first of our two questions concerning embeddings 
of integral domains in fields. It is now time to answer the second. 


THEOREM 4. Let R be an integral domain with field of fractions 
F. If R can be embedded in a field K, then F can also be embedded in K. 


Proof. In Theorem 1 we have already proved a special case of this 
result, and the proof of Theorem 1 may be used here with only minor 
changes. Let h: R → K be an embedding. The embedding h': F → K is given
by h'([a,b]) = h(a)/h(b). As in the proof of Theorem 1, we must show
that h' is well defined. The details of this verification and the proof that
h' is an embedding are left as exercises. □


EXERCISES 

1 Complete the proof of Theorem 2 by showing that the set F of 
formal quotients is a field. 

2 Leta bea nonzero element of Z3). Show that there is a ring homo- 
morphism of Z3) into a field such that a is not mapped to 0. 
Describe the set of positive integers m such that a similar statement 
can be made about $Z_m$.

3 Fill in the details in the proof of Theorem 4. 

4 In this section we defined the field of fractions of an integral
domain R. It is possible to generalize our definition slightly. We say
that a field of fractions for R is a pair (F,h) consisting of a field
F and an embedding h: R → F that satisfy the following conditions.
Whenever f: R → K is an embedding of R into a field K,
then there exists an embedding g: F → K such that the diagram

           h
      R ------> F
       \        |
      f \       | g
         \      v
          `---> K

is commutative. Show that any two fields of fractions $(F_1,h_1)$ and
$(F_2,h_2)$ are isomorphic in the sense that there is an isomorphism
θ: $F_1 \to F_2$ such that the diagram

          h_1
      R ------> F_1
       \        |
    h_2 \       | θ
         \      v
          `---> F_2

is commutative.


9. EUCLIDEAN DOMAINS 


In Chapter 2 we studied the ring Z extensively. Looking back, we can see
that one of the properties of the integers that was very important for
establishing the existence of greatest common divisors and for proving the
existence and uniqueness of factorizations into primes was the division
property. The division property states that, given two integers a and b with
b ≠ 0, there exist unique integers q and r such that a = qb + r and 0 ≤ r <
|b|. We called q the integral quotient of a and b and r the remainder when
a is divided by b. The procedures ZQUOT and ZREM in CLASSLIB
compute integer quotients and remainders, respectively.

In this section we will introduce a class of rings for which many of 
the results in Chapter 2 hold. The definition of this class of rings depends 
on a generalization of the division property. In Z we can divide by a non- 
zero integer and get a small remainder, where smallness is measured by the 
absolute value function. In our more general situation, smallness in a ring 
R is measured by a function from R — {0} to the set of nonnegative integers. 

Let R be an integral domain. A Euclidean norm on R is a function
N from R − {0} to Z such that

1. 0 ≤ N(a) ≤ N(ab) for all a, b in R − {0}.
2. If a and b are in R with b ≠ 0, then there exist q and r in R such
that a = qb + r and either r = 0 or N(r) < N(b).


A Euclidean domain is an integral domain possessing a Euclidean 
norm. 


THEOREM 1. The ring Z is a Euclidean domain. 


Proof. For any integer a, set N(a) = |a|. Then N is a Euclidean norm
on Z. □


Note that condition (2) in the definition of a Euclidean norm does not
require that q and r be unique. Even in Z we generally have two choices
for r. For example, if a = 7 and b = 5, then

    a = b + 2 = 2b + (−3)

and both |2| and |−3| are less than |b|. In Z we get uniqueness by adding
the extra condition that r be nonnegative.

In the next example of a Euclidean domain q and r are always unique.
Recall that the degree of a nonzero polynomial f is denoted deg(f).


THEOREM 2. If F is a field, then the function deg is a Euclidean
norm on F[X].


Proof. For any f ≠ 0 in F[X], the degree deg(f) of f is a nonnegative
integer. If g ≠ 0, then deg(fg) = deg(f) + deg(g) ≥ deg(f). The algorithm
taught in high school for the division of polynomials in $\mathbf{R}[X]$ generalizes
immediately to polynomials over any field and produces the required
quotient and remainder. □


As an example, let us consider $f = 2X^2 + X + 3$ and
$g = 4X^4 + 3X^3 + X^2 + 2X + 4$ in $Z_5[X]$. Dividing,

                                 2X^2 + 3X + 1
    2X^2 + X + 3 ) 4X^4 + 3X^3 +  X^2 + 2X + 4
                   4X^4 + 2X^3 +  X^2
                           X^3 + 0X^2 + 2X
                           X^3 + 3X^2 + 4X
                                 2X^2 + 3X + 4
                                 2X^2 +  X + 3
                                        2X + 1

we find that g = qf + r, where $q = 2X^2 + 3X + 1$ and $r = 2X + 1$.
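
The division just carried out by hand can be sketched in Python as follows. This is an illustration, not the CLASSLIB implementation; the names `trim` and `poly_divmod` are hypothetical. Coefficient lists are in ascending order, the convention the APL procedures appear to use, and the leading coefficient of the divisor is assumed to be a unit modulo p.

```python
def trim(c):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    c = list(c)
    while c and c[-1] == 0:
        c.pop()
    return c


def poly_divmod(g, f, p):
    """Return (q, r) with g = q*f + r in Z_p[X] and r = 0 or deg r < deg f.

    Coefficient lists are ascending: [3, 1, 2] means 3 + X + 2X^2.
    """
    f = trim(c % p for c in f)
    r = trim(c % p for c in g)
    if not f:
        raise ZeroDivisionError("division by the zero polynomial")
    lead_inv = pow(f[-1], -1, p)        # ValueError if not a unit mod p
    q = [0] * max(len(r) - len(f) + 1, 0)
    while len(r) >= len(f):
        shift = len(r) - len(f)
        c = (r[-1] * lead_inv) % p      # next coefficient of the quotient
        q[shift] = c
        for k, fk in enumerate(f):      # subtract c * X^shift * f from r
            r[shift + k] = (r[shift + k] - c * fk) % p
        r = trim(r)
    return q, r


f = [3, 1, 2]          # 2X^2 + X + 3
g = [4, 2, 1, 3, 4]    # 4X^4 + 3X^3 + X^2 + 2X + 4
print(poly_divmod(g, f, 5))   # ([1, 3, 2], [1, 2])
```

The result corresponds to $q = 2X^2 + 3X + 1$ and $r = 2X + 1$.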

The procedures RXQUOT and RXREM compute quotients and remainders
in $\mathbf{R}[X]$.

      A←¯3.2 4 7.4 2
      B←¯1.5 2.1 1
      A RXQUOT B
3.2 2
      B RXREM A
1.6 0.28

Here we see that the quotient and remainder when $-3.2 + 4X + 7.4X^2 + 2X^3$
is divided by $-1.5 + 2.1X + X^2$ are 3.2 + 2X and 1.6 + 0.28X,
respectively. It is important to remember that the results of RXQUOT and
RXREM are generally only approximations of the true quotient and re- 
mainder. Both procedures are defined for arrays of polynomials. In fact, 
all the procedures with the suffixes QUOT and REM are defined for arrays 
over the appropriate Euclidean domain. 

If n is a prime, then $Z_n[X]$ is a Euclidean domain by Theorem 2.
Even if n is composite, we can still perform division in $Z_n[X]$ provided
the leading coefficient of the divisor is a unit in $Z_n$. (Why?) The
procedures ZNXQUOT and ZNXREM compute quotients and remainders in
$Z_n[X]$, where n is given by N, which must not exceed $10^7$.

      N←5
      F←3 1 2
      G←4 2 1 3 4
      G ZNXQUOT F
1 3 2
      F ZNXREM G
1 2
      1 2 ZNXSUM 1 3 2 ZNXPROD F
4 2 1 3 4


In a Euclidean domain the units are easily determined. 


THEOREM 3. Let R be a Euclidean domain with norm N. If 0 ≠ b in
R, then b is a unit in R if and only if N(b) = N(1). If b is not a unit, then
N(ab) > N(a) for all a ≠ 0.


Proof. Let a and b be nonzero elements of R. Then N(ab) ≥ N(a).
Suppose N(ab) = N(a). We can find q and r in R such that a = q(ab) + r and
either r = 0 or N(r) < N(ab) = N(a). But r = a(1 − qb) and so, if r ≠ 0,
then N(r) ≥ N(a). Thus r = 0 and a = aqb. Since R is an integral domain, we
may cancel the factor a and obtain 1 = qb. Therefore b is a unit. Taking
a = 1, we see that N(b) = N(1) implies that b is a unit. But if b is any unit,
then bc = 1 for some c in R and N(1) = N(bc) ≥ N(b). Since N(b) = N(1b) ≥ N(1),
this means that N(b) = N(1). Thus the units in R are the elements b with
N(b) = N(1). □


There is one more ring that we have already encountered that is a 
Euclidean domain. 


THEOREM 4. The ring Z[i] of Gaussian integers is a Euclidean domain. 


Proof. Since Z[i] is a subring of C, Z[i] is certainly an integral domain.
For any complex number z = x + yi, with x and y real, we defined the norm
N(z) to be $x^2 + y^2 = z\bar{z}$. This norm function satisfies the condition N(zw) =
N(z)N(w). (See Exercise 3.6.) If z and w are in Z[i] and z ≠ 0, then N(z) ≥
1 and N(zw) ≥ N(w). All that remains is to prove that we can write w =
qz + r with N(r) < N(z). Since C is a field, we can write w/z = a + bi, where
a and b are in $\mathbf{R}$ (in fact, in Q). Choose integers u and v such that |a − u| ≤
1/2 and |b − v| ≤ 1/2. Set q = u + vi and r = w − qz. Then

$$r = z(a + bi) - qz = z[(a - u) + (b - v)i].$$

Therefore

$$N(r) = N(z)\,N[(a - u) + (b - v)i] \le N(z)\left(\tfrac{1}{4} + \tfrac{1}{4}\right) < N(z). \qquad\Box$$


The library CLASSLIB contains procedures GAUSSQUOT and GAUSSREM
for computing quotients and remainders in Z[i]. For example, if z = −2 +
3i and w = 10 − 7i, then the calculations

      Z←¯2 3
      W←10 ¯7
      ⎕←Q←W GAUSSQUOT Z
¯3 ¯1
      ⎕←R←Z GAUSSREM W
1 0
      R CSUM Q CPROD Z
10 ¯7
      CNORM Z
13
      CNORM R
1

show that w = qz + r, where q = −3 − i and r = 1, and N(r) < N(z).
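
The rounding argument in the proof of Theorem 4 can be sketched in exact integer arithmetic. The Python below is an illustration and makes no claim about how GAUSSQUOT and GAUSSREM are written; the names `nearest`, `gauss_divmod`, and `norm` are hypothetical. A Gaussian integer x + yi is represented by the pair (x, y).

```python
def nearest(t, n):
    """An integer u with |t/n - u| <= 1/2, for integers t and n > 0."""
    return (2 * t + n) // (2 * n)


def norm(z):
    x, y = z
    return x * x + y * y


def gauss_divmod(w, z):
    """Return (q, r) with w = q*z + r and N(r) <= N(z)/2 < N(z)."""
    (wx, wy), (zx, zy) = w, z
    n = norm(z)
    if n == 0:
        raise ZeroDivisionError("division by zero in Z[i]")
    # w/z = w * conj(z) / N(z); tx + ty*i is the numerator w * conj(z).
    tx = wx * zx + wy * zy
    ty = wy * zx - wx * zy
    u, v = nearest(tx, n), nearest(ty, n)
    r = (wx - (u * zx - v * zy), wy - (u * zy + v * zx))
    return (u, v), r


q, r = gauss_divmod((10, -7), (-2, 3))
print(q, r)   # (-3, -1) (1, 0)
```

When w/z lies exactly halfway between two integers in some coordinate, `nearest` rounds up, which is one of several valid choices.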

In Chapter 2, one of the first facts we proved about Z was that every 
ideal, in fact, every subgroup of (Z,+), is generated by one element. In a 
commutative ring R, an ideal I is called principal if I = Rx for some x in
R. A principal ideal domain is an integral domain in which every ideal is
principal. The phrase "principal ideal domain" is often abbreviated "PID".




THEOREM 5. Every Euclidean domain is a PID. 


Proof. Let R be a Euclidean domain with norm N and let I be an ideal
of R. If I = {0}, then I = R0, so I is principal. If I ≠ {0}, then
{N(x) | x ∈ I − {0}} is a nonempty set of nonnegative integers and so has a
smallest element. Choose x in I − {0} such that N(x) ≤ N(y) for all y in
I − {0}. Clearly, I ⊇ Rx. To prove I = Rx, we must prove I ⊆ Rx. Suppose y is
in I. We can find q and r in R such that y = qx + r and either r = 0 or
N(r) < N(x). Now r = y − qx and so r is in I. Hence, if r ≠ 0, then
N(r) ≥ N(x). Thus r = 0 and y = qx is in Rx. Therefore I ⊆ Rx. □


The converse of Theorem 5 is false. There do exist PID’s that are not 
Euclidean domains. An example may be found in Motzkin. 

The next result exhibits an important property of PID’s and hence of 
Euclidean domains. 


THEOREM 6. Let R be a PID and let $I_1 \subseteq I_2 \subseteq \cdots$ be an infinite
sequence of ideals in R with each ideal contained in the next. Then there
is an integer n such that $I_m = I_n$ for all m ≥ n.


Proof. Let

$$I = \bigcup_{i=1}^{\infty} I_i.$$

We claim I is an ideal of R. Let a and b be elements of I and let r be in
R. Then $a \in I_p$ and $b \in I_q$ for some p and q. We may assume p ≥ q and
so, in fact, both a and b are in $I_p$. Therefore ra, a + b, and a − b are in
$I_p$ and hence in I. Thus I is an ideal. But R is a PID, so I = Rx for some
x in I. There exists an n such that $x \in I_n$. Suppose m ≥ n. Then

$$I_n \subseteq I_m \subseteq I = Rx \subseteq I_n.$$

Therefore $I_m = I_n$ for all m ≥ n. □


A ring whose ideals satisfy the condition of Theorem 6, that is, a ring 
with no infinite strictly increasing sequence of ideals, is said to satisfy the 
ascending chain condition on ideals. 

The class of Euclidean domains contains most rings in which we will 
want to perform computations. The existence of a Euclidean norm makes 
calculation much easier than in an arbitrary ring. Most of Chapter 6 is de- 
voted to a study of algorithms for performing certain types of computa- 
tions related to Euclidean domains. However, we should point out that the 
rings Z[X] and F[X,Y], where F is a field, are important rings that are
not Euclidean domains; in fact, they are not PID’s. 

We close this section with some facts about polynomials with co- 
efficients in a field. 




Let F be a field and let f be a polynomial in F[X]. An element c of
F is called a root of f if f(c) = 0. Every element of F is a root of the zero
polynomial. However, as the next result shows, nonzero polynomials in 
F[X] cannot have many roots. 


THEOREM 7. Let F be a field and let f be an element of F[X] with 
deg(f) =n > O. Then f has at most n roots in F. 


Proof. We proceed by induction on n. Suppose n = 1. Then f = aX +
b, where a ≠ 0. If 0 = f(c) = ac + b, then ac = −b and c = −b/a. Thus f has
exactly one root in F. Suppose now that n > 1. Clearly, we may assume
f has at least one root c in F. Dividing f by X − c, we can write f = (X − c)q +
r, where q is in F[X] and r is in F. The degree of q is n − 1. Evaluating f at
c, we find that 0 = f(c) = (c − c)q(c) + r = r. Thus r = 0 and f = (X − c)q.
Suppose d is a root of f in F with d ≠ c. Then 0 = f(d) = (d − c)q(d). Since
d − c ≠ 0 and F has no zero divisors, this means that q(d) = 0, so d is a root
of q. By induction, q has at most n − 1 roots in F. Thus f has at most
n roots in F. □
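
A brute-force count of roots shows both the bound of Theorem 7 and the role played by the absence of zero divisors. The Python sketch below is an illustration only; the names `poly_eval` and `roots` are hypothetical, and coefficient lists are in ascending order. The polynomial $X^2 + X$ has two roots in the field $Z_7$ but four roots in $Z_6$, which is not an integral domain.

```python
def poly_eval(f, c, n):
    """Value of the polynomial f at c in Z_n, by Horner's rule."""
    value = 0
    for coeff in reversed(f):
        value = (value * c + coeff) % n
    return value


def roots(f, n):
    """All roots of f in Z_n, found by trying every element."""
    return [c for c in range(n) if poly_eval(f, c, n) == 0]


f = [0, 1, 1]          # X + X^2
print(roots(f, 7))     # [0, 6]
print(roots(f, 6))     # [0, 2, 3, 5]
```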


The following corollary of Theorem 7 will be important for the poly- 
nomial factoring algorithm described in Section 12. 

COROLLARY 8. Let F be a field and let f and g be elements of F[X]
such that deg(f) and deg(g) are each at most n. If there exist distinct
elements $a_0, \ldots, a_n$ of F such that $f(a_i) = g(a_i)$, 0 ≤ i ≤ n, then f = g.

Proof. Let h = f − g. Then, for 0 ≤ i ≤ n, we have $h(a_i) = f(a_i) - g(a_i) = 0$,
so each $a_i$ is a root of h. By Theorem 7, h must be a constant polynomial,
and so h = 0. This means that f = g. □


EXERCISES 


1 Let N be a Euclidean norm on the integral domain R. Suppose 
c is a nonnegative integer and N'(x) = N(x) + c for all x in R. 
Show that N’ is a Euclidean norm on R. 

2 Let F be a field. For 0 ≠ x in F, set N(x) = 0. Show that N is
a Euclidean norm on F.

3 Let F be a field. Give an explicit description of the division
algorithm in F[X]. That is, if $f = a_0 + a_1X + \cdots + a_nX^n$ and
$g = b_0 + \cdots + b_mX^m$ are in F[X] with g ≠ 0, tell how to compute
the coefficients of the polynomials q and r such that f = qg + r
and either r = 0 or deg(r) < deg(g).


4 A monic polynomial is one whose leading coefficient is 1. Let
f and g be in Z[X], with g monic. Show that there exist q and
r in Z[X] with f = qg + r and either r = 0 or deg(r) < deg(g).


5 Let z and w be Gaussian integers with z ≠ 0. Determine the maximum
number of pairs (q, r) such that w = qz + r and N(r) < N(z).
State an additional condition that makes r unique.


6 Letf=1+X+3X2 +3X27 + 4X4 4+ 5X5 andg=5+2X+3X’ 
in Z,[X]. Compute by hand the quotient and remainder when 
f is divided by g. Check your answer using ZNXQUOT and ZNXREM. 


7 Let z = 3 − 2i and w = −11 + 7i. Find all pairs (q, r) of Gaussian
integers such that w = qz + r and N(r) < N(z).


8 Let I be the set of all polynomials $a_0 + a_1X + \cdots + a_nX^n$ in Z[X]
with $a_0$ even. Show that I is an ideal of Z[X] and that I is not
principal.

9 Let F be a field. Prove that F[X,Y] is not a PID.

*10 Show that Z[X] satisfies the ascending chain condition for ideals. 


11 Let R be a ring in which every ideal I is finitely generated in that
there exist $x_1, \ldots, x_n$ in I such that I is the smallest ideal in R
containing all of the $x_i$. Show that R satisfies the ascending chain
condition for ideals.


12 Define the descending chain condition for ideals in a ring. Does 
Z satisfy this condition? 


13. Show that the requirement in Theorem 8 that F be a field is neces- 
sary by determining the number of roots of X? — 1 in the ring 
Zs . 


10. FACTORIZATION 


Let a and b be elements of an integral domain R. We say a divides b in 
R if there is an element c such that b = ac. We also say that a is a divisor 
of b, that a is a factor of b, and that b is divisible by a. In Z two integers 
a and b can be divisors of each other if and only if a = ±b. The following
theorem generalizes this fact. 


THEOREM 1. Let a and b be elements of the integral domain R. Then 
the following are equivalent. 

(a) a divides b and b divides a in R. 

(b) Ra = Rb.

(c) There is a unit u in R such that b = ua.


Proof. If a divides b, then b = ac for some c in R. If r is any element
of R, then rb = (rc)a, so Ra ⊇ Rb. If, in addition, b divides a, then a =
bd for some d in R and Rb ⊇ Ra. Thus Ra = Rb and so (a) implies (b).
Also, we have a = bd = acd. If a ≠ 0, we may cancel the factor a in the
equation a = acd to obtain 1 = cd, and so c and d are units. If a = 0, then
b = ac = 0. Here b = ua with u = 1. Hence (a) also implies (c). It is easy to
see that (b) and (c) each imply (a). If Ra = Rb, then b is in Ra, and so
b = ca for some c. Similarly, a = bd for some d. Thus (b) implies (a).
Finally, if b = ua, with u a unit in R, then obviously a divides b. If v is the
inverse of u, then bv = uva = a and so (c) also implies (a). □


If a and b satisfy any one, and hence all, of the conditions in Theorem
1, we say a and b are associates. It is easy to show (see Exercise 1) that the
relation "is an associate of" is an equivalence relation. In some integral
domains there is a natural choice for a representative in each class of asso- 
ciate elements. For example, in Z each class contains a unique nonnegative 
element, while in the polynomial ring F[X] over a field F every nonzero 
polynomial is an associate of a unique monic polynomial, one whose leading 
coefficient is 1. In Z[i] an element z # 0 has four associates, z, iz, —z, and 
—iz. It is less obvious here which element to choose as the representative 
for the class. See Exercise 3. 

In an arbitrary integral domain R there is no notion of the relative 
size of two elements, but we can still define the concept of a greatest com- 
mon divisor. If a and b are in R, then an element d of R is called a greatest 
common divisor (gcd) of a and b, provided the following conditions hold. 


1. d is a divisor of both a and b.
2. If c is an element of R that divides both a and b, then c divides d.


THEOREM 2. Any two greatest common divisors of a and b are
associates.


Proof. Suppose $d_1$ and $d_2$ are two greatest common divisors of a and
b. Then $d_1$ divides both a and b and so $d_1$ divides $d_2$. By symmetry, $d_2$
divides $d_1$, so $d_1$ and $d_2$ are associates. □


In general, two elements of R need not have a greatest common divi- 
sor, but this cannot happen in a PID. 


THEOREM 3. Let R be a PID. If a and b are in R, then a and b have a
greatest common divisor d and d = ra + sb for some r and s in R.


Proof. The ideal of R generated by a and b is I = Ra + Rb. Since R is a
PID, we know that I = Rd for some element d in R. Clearly, d has the form
ra + sb. Also, since a and b are both in I, it follows that d divides both
a and b. Finally, if c divides both a and b, then c divides ra + sb = d.
Therefore d is a greatest common divisor of a and b. □
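
In the PID Z, elements r and s with d = ra + sb can be computed by keeping track of coefficients while taking remainders. The following Python sketch of this extended Euclidean algorithm is an illustration only; the name `ext_gcd` is hypothetical and the sample values 57 and 91 are chosen arbitrarily.

```python
def ext_gcd(a, b):
    """Return (d, r, s) with d = gcd(a, b) >= 0 and d == r*a + s*b."""
    # Invariants: d0 == r0*a + s0*b and d1 == r1*a + s1*b.
    r0, s0, d0 = 1, 0, a
    r1, s1, d1 = 0, 1, b
    while d1 != 0:
        q = d0 // d1
        r0, s0, d0, r1, s1, d1 = r1, s1, d1, r0 - q * r1, s0 - q * s1, d0 - q * d1
    if d0 < 0:                       # choose the nonnegative associate
        r0, s0, d0 = -r0, -s0, -d0
    return d0, r0, s0


d, r, s = ext_gcd(57, 91)
print(d, r * 57 + s * 91)   # 1 1
```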




In Section 2.2 we learned how to compute greatest common divisors 
of integers by repeated computation of remainders. For example, 


      57 ZREM 91
34
      34 ZREM 57
23
      23 ZREM 34
11
      11 ZREM 23
1
      1 ZREM 11
0


shows that gcd(57,91) = 1. This Euclidean algorithm works in any Euclidean
domain. To compute the greatest common divisor of $1 + 2X + 3X^2 + 3X^3$
and $2 + X + 2X^2 + 2X^4$ in $Z_5[X]$, we proceed as follows.

      N←5
      A←1 2 3 3
      B←2 1 2 0 2
      ⎕←C←A ZNXREM B
1 0 1
      ⎕←D←C ZNXREM A
3 4
      ⎕←E←D ZNXREM C
0

We normally replace the result 3 + 4X by its monic associate.

      ⎕←D←D ZNXPROD ZNINV ¯1↑D
2 1

Thus we obtain $\gcd(1 + 2X + 3X^2 + 3X^3,\ 2 + X + 2X^2 + 2X^4) = 2 + X$ in
$Z_5[X]$. Similarly, RXREM and GAUSSREM may be used to compute greatest
common divisors in $\mathbf{R}[X]$ and Z[i], respectively. As usual, RXREM must be
used with caution, since round-off errors can be disastrous.
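
The same computation can be sketched in Python. The code below is an illustration rather than a transcription of ZNXREM; the names `trim`, `poly_rem`, and `poly_gcd` are hypothetical, coefficient lists are in ascending order, and p is assumed to be prime so that every nonzero leading coefficient is a unit.

```python
def trim(c):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    c = list(c)
    while c and c[-1] == 0:
        c.pop()
    return c


def poly_rem(a, b, p):
    """Remainder when a is divided by b in Z_p[X] (p prime, b nonzero)."""
    r = trim(c % p for c in a)
    b = trim(c % p for c in b)
    lead_inv = pow(b[-1], -1, p)
    while len(r) >= len(b):
        shift = len(r) - len(b)
        c = (r[-1] * lead_inv) % p
        for k, bk in enumerate(b):
            r[shift + k] = (r[shift + k] - c * bk) % p
        r = trim(r)
    return r


def poly_gcd(a, b, p):
    """Monic greatest common divisor in Z_p[X] by repeated remainders."""
    a = trim(c % p for c in a)
    b = trim(c % p for c in b)
    while b:
        a, b = b, poly_rem(a, b, p)
    if a:                                   # replace by the monic associate
        inv = pow(a[-1], -1, p)
        a = [(c * inv) % p for c in a]
    return a


print(poly_rem([2, 1, 2, 0, 2], [1, 2, 3, 3], 5))   # [1, 0, 1]
print(poly_gcd([1, 2, 3, 3], [2, 1, 2, 0, 2], 5))   # [2, 1]
```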

The fact that Theorem 2.4.2 is called the Fundamental Theorem of 
Arithmetic indicates the importance of being able to factor positive integers 
into a product of primes in essentially one way. A prime p in Z is a nonzero 
nonunit that satisfies each of the following two conditions. 


1. If p = ab with a and b in Z, then either a or b is a unit in Z.


2. If p divides a product ab of integers, then either p divides a or p 
divides b. 


In Z these conditions are equivalent. However, if we replace Z by an arbi- 
trary integral domain, then conditions (1) and (2) are no longer equivalent. 

Let R be an integral domain. An element p ≠ 0 of R is said to be
irreducible in R if p is not a unit and, whenever p = ab with a and b in R,
then either a or b is a unit. A prime in R is a nonzero nonunit p such that
whenever p divides a product ab of elements in R, then either p divides
a or p divides b. An easy induction argument shows that if a prime p divides
a product $a_1a_2 \cdots a_n$ of elements of R, then p divides one of the factors
$a_i$.


THEOREM 4. Every prime element in R is irreducible. 


Proof. Suppose p is a prime in R and p = ab. Then certainly p divides 
ab and, therefore, p divides a or b. We may assume p divides a. In this 
case, a = pc and p = pcb for some c in R. Since p ≠ 0, we may cancel the
factor p to obtain 1 = cb. Thus b is a unit. □


There exist integral domains that contain irreducible elements that
are not primes. For example, let $R = Z[\sqrt{-5}]$. It is not hard to show (see
Exercise 22) that 3, $2 + \sqrt{-5}$, and $2 - \sqrt{-5}$ are irreducible in R. Now

$$(3)(3) = 9 = (2 + \sqrt{-5})(2 - \sqrt{-5}).$$

Thus 3 divides the product of $2 + \sqrt{-5}$ and $2 - \sqrt{-5}$. However, it is
easy to see that 3 does not divide either factor, so 3 is not a prime in R.
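
The claims in this example can be checked mechanically. In the Python sketch below, which is an illustration with hypothetical names, the pair (a, b) stands for $a + b\sqrt{-5}$ and the norm is assumed to be $N(a + b\sqrt{-5}) = a^2 + 5b^2$, which is multiplicative. An element of norm 9 could factor into two nonunits only if R contained an element of norm 3, and the search shows there is none.

```python
def norm(x):
    """N(a + b*sqrt(-5)) = a^2 + 5b^2."""
    a, b = x
    return a * a + 5 * b * b


def mul(x, y):
    """Product in Z[sqrt(-5)], using sqrt(-5)^2 = -5."""
    (a, b), (c, d) = x, y
    return (a * c - 5 * b * d, a * d + b * c)


def divides(x, y):
    """True if the nonzero element x divides y in Z[sqrt(-5)]."""
    a, b = x
    t = mul(y, (a, -b))          # y * conj(x) = N(x) * (y / x)
    n = norm(x)
    return t[0] % n == 0 and t[1] % n == 0


three, u, v = (3, 0), (2, 1), (2, -1)
print(mul(u, v))                                  # (9, 0)
print(divides(three, mul(u, v)))                  # True
print(divides(three, u), divides(three, v))       # False False

# a^2 + 5b^2 = 3 forces b = 0 and |a| <= 1, so a small search suffices.
norm3 = [(a, b) for a in range(-2, 3) for b in range(-1, 2) if norm((a, b)) == 3]
print(norm3)                                      # []
```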

In a PID the distinction between primes and irreducible elements 
vanishes. 


THEOREM 5. If R is a PID, then every irreducible element in R is a 
prime. 

Proof. Suppose R is a PID and p is irreducible in R. Assume that p
divides ab. By Theorem 3, a and p have a greatest common divisor d, and
d can be written as ra + sp with r and s in R. If d is a unit, we may assume
(see Exercise 2) that d = 1. Multiplying the equality 1 = ra + sp by b, we
obtain b = rab + spb. Since p divides both ab and p, we see that p divides
b. If, on the other hand, d is not a unit, then p = cd for some c in R, and
since p is irreducible, c must be a unit. Now d divides a, and so a = qd =
$qc^{-1}p$ for some q in R. Therefore p divides a. We have shown that p divides
a or p divides b, so p is prime. □


A unique factorization domain (UFD) is an integral domain R such
that each nonzero element a of R that is not a unit can be factored as

$$a = p_1p_2 \cdots p_r,$$

where each $p_i$ is irreducible. Moreover, we require this factorization to be
unique up to order and associates. This means that if

$$a = q_1q_2 \cdots q_s$$

with each $q_i$ irreducible, then r = s and there is a permutation σ of {1, ..., r}
such that $q_i$ is an associate of $p_{i\sigma}$. By Theorem 2.4.2, we know that Z is a
UFD. The ring $Z[\sqrt{-5}]$ is not a UFD, since

$$(3)(3) = (2 + \sqrt{-5})(2 - \sqrt{-5})$$

and 3 is not an associate of either of the elements $2 \pm \sqrt{-5}$, and 3 and
$2 \pm \sqrt{-5}$ are all irreducible.


Theorem 5 remains true when we replace "PID" by "UFD".

THEOREM 6. Let R be a UFD. Every irreducible element in R is prime.


Proof. Let p be an irreducible element in R and suppose p divides
uv. Then uv = pw for some w in R. We must show that p divides u or p
divides v. If u or v is 0, then clearly we are done. Thus we may assume
neither u nor v is 0. If u is a unit, then $v = pwu^{-1}$ and p divides v. Hence we
may also assume that neither u nor v is a unit. Since R is a UFD, we may
write $u = u_1 \cdots u_r$, $v = v_1 \cdots v_s$, and $w = w_1 \cdots w_t$, where each $u_i$, $v_j$, and
$w_k$ is irreducible. Moreover, we have

$$p\,w_1 \cdots w_t = u_1 \cdots u_r\,v_1 \cdots v_s.$$

By the uniqueness of factorizations in R, we know that p is an associate
of some $u_i$ or $v_j$. If p is an associate of $u_i$, then p divides $u_i$ and thus p
divides u. If p is an associate of $v_j$, then p divides v. Therefore p divides
either u or v. □

Now we come to the main result of this section. 

THEOREM 7. Every PID is a UFD. 


Proof. Let R be a PID. We will prove the existence and uniqueness 
of factorizations in R by a sequence of lemmas. 


LEMMA 8. There do not exist infinite sequences $a_1, a_2, \ldots$
and $b_1, b_2, \ldots$ of elements of R such that each $a_i$ is a nonzero nonunit,
each $b_i$ is nonzero, and


$$a_1 b_1 = a_1 a_2 b_2 = a_1 a_2 a_3 b_3 = \cdots$$


Proof. Suppose such a pair of sequences existed. Cancelling the factor
$a_1 \cdots a_i$ in the equality $a_1 \cdots a_i b_i = a_1 \cdots a_{i+1} b_{i+1}$, we see that $b_i =
a_{i+1} b_{i+1}$, and so $b_{i+1}$ divides $b_i$ for all i ≥ 1. Set $I_i = Rb_i$. Then $I_1 \subseteq I_2 \subseteq
I_3 \subseteq \cdots$. By Theorem 9.7 there exists an integer n such that $I_m = I_n$ for all
m ≥ n. In particular, $I_n = I_{n+1}$, and so $b_n$ and $b_{n+1}$ must be associates. This
means that $a_{n+1}$ is a unit, which contradicts our assumption that each
$a_i$ is a nonunit. □


LEMMA 9. Let a be a nonzero nonunit in R. Then a is divisible by an 
irreducible element. 


Proof. If a is irreducible, we are done, since a is divisible by itself. If
a is not irreducible, we can write $a = a_1 b_1$, where $a_1$ and $b_1$ are nonzero
nonunits. If $b_1$ is irreducible, then again we are done. If $b_1$ is not irreducible,
we can write $b_1 = a_2 b_2$, where $a_2$ and $b_2$ are nonzero nonunits. Continuing
in this way, we see that the assumption that a is not divisible by an irreducible
element leads to the construction of a pair of sequences that Lemma
8 says cannot be constructed in R. □


LEMMA 10. Every nonzero nonunit in R can be factored into a product 
of irreducible elements. 


Proof. Let a be a nonzero nonunit. By Lemma 9, we have $a = a_1 b_1$,
where $a_1$ is irreducible. If $b_1$ is a unit, then a is itself irreducible (see Exercise
9), and we are done. If $b_1$ is not a unit, then $b_1 = a_2 b_2$, with $a_2$ irreducible.
If $b_2$ is a unit, then $b_1$ is irreducible, and we are done. Otherwise,
we can factor $b_2$ as $a_3 b_3$ with $a_3$ irreducible. If we are not to produce a
pair of sequences of the type ruled out by Lemma 8, we must eventually
arrive at a factorization $a = a_1 a_2 \cdots a_n b_n$, in which $b_n$ and each $a_i$ are
irreducible. □


We complete the proof of Theorem 7 with the following. 


LEMMA 11. The factorization in Lemma 10 is unique up to order 
and associates. 


Proof. Suppose $p_1 \cdots p_r = q_1 \cdots q_s$, where each $p_i$ and $q_j$ is irreducible
in R. If r = 1, then s must be 1, since $p_1$ is irreducible. Thus we may
assume r and s are each at least 2 and proceed by induction on r. By Theorem
5, $p_1$ is a prime. Since $p_1$ divides the product $q_1 \cdots q_s$, we know that
$p_1$ divides some $q_j$. Renumbering the q's, we may assume $p_1$ divides $q_1$. But
$q_1$ is irreducible, so $q_1 = up_1$, where u is a unit in R. Replacing $q_1$ by its
associate $u^{-1}q_1 = p_1$ and $q_2$ by its associate $uq_2$, we have $p_1 \cdots p_r = p_1 q_2 \cdots q_s$.
Thus $p_2 \cdots p_r = q_2 \cdots q_s$.
By induction, r = s, and we may rearrange the q's so that $p_i$ and $q_i$ are
associates, 2 ≤ i ≤ r. Thus the factorization $p_1 \cdots p_r$ is unique up to order
and associates. □

Theorem 7 together with Theorem 9.5 tells us that every Euclidean do- 
main is a UFD. Thus the rings F[X], where F is a field, and Z[i] are UFD’s. 
Let us investigate factorization in these rings in more detail. 

If F is a field, then any polynomial f in F[X] of positive degree can be
factored as $f = d f_1 f_2 \cdots f_r$, where d is in F and each $f_i$ is a monic irreducible
polynomial. The $f_i$ are unique up to order. There are infinitely many monic
irreducible polynomials in F[X]. If a is in F, then X + a is irreducible. 
Hence, if F is an infinite field, then there are infinitely many monic irre- 
ducible polynomials of degree 1. If F is finite, it can be shown (see Exercise 
16) that irreducible polynomials can be found in F[X] with arbitrarily 
large degree. 

If p is a prime and m is a positive integer such that $p^m$ is not too
large, say $p^m < 1000$, then the monic irreducible polynomials in $Z_p[X]$
with degree at most m can be listed with a simple variation of the sieve
technique used in Section 2.4 to list primes in Z. For example, if


      ⎕IO←0
      N←2
      ⎕←A←⍉⊖2 2 2 2⊤2+⍳14
0 1 0 0
1 1 0 0
0 0 1 0
1 0 1 0
0 1 1 0
1 1 1 0
0 0 0 1
1 0 0 1
0 1 0 1
1 1 0 1
0 0 1 1
1 0 1 1
0 1 1 1
1 1 1 1


then the rows of A list the 14 nonconstant polynomials in $Z_2[X]$ with degree
not exceeding 3. All of these polynomials are monic. The first row of
A represents the polynomial X, which is irreducible. We can sieve out the
multiples of this polynomial by


      Y←∨/0≠A[0;] ZNXREM A
      ⎕←A←Y⌿A
1 1 0 0
1 0 1 0
1 1 1 0
1 0 0 1
1 1 0 1
1 0 1 1
1 1 1 1

Here we compute the remainder when each row of A is divided by the first 
row of A in Z,[X]. The vector Y is the characteristic vector of the set of 
rows in A that are not divisible by the first row. We use Y to select the rows 
of A we wish to keep. 

The new first row of A corresponds to 1 + X and is also irreducible.
Sieving out its multiples,


      Y←∨/0≠A[0;] ZNXREM A
      ⎕←A←Y⌿A
1 1 1 0
1 1 0 1
1 0 1 1



we obtain a matrix whose rows list polynomials of degree 2 and 3. Any
polynomial of degree 2 or 3 that is not irreducible must have a factor of
degree 1. Since we have eliminated all polynomials with a linear factor,
these three polynomials must be irreducible. Thus the irreducible polynomials
in $Z_2[X]$ with degree at most 3 are $X$, $1 + X$, $1 + X + X^2$,
$1 + X + X^3$, and $1 + X^2 + X^3$.
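For readers without the APL workspace, the sieve just described can be sketched in Python. This is an illustration under our own conventions, not the book's procedures: the function names are hypothetical, and a polynomial is a list of coefficients with the constant term first, as in the rows of A.

```python
from itertools import product

def poly_rem(f, g, p):
    """Remainder of f on division by the monic polynomial g in Z_p[X]."""
    f = [c % p for c in f]
    dg = len(g) - 1
    for i in range(len(f) - 1, dg - 1, -1):
        c = f[i]
        if c:
            # subtract c * X^(i-dg) * g to cancel the coefficient of X^i
            for j in range(dg + 1):
                f[i - dg + j] = (f[i - dg + j] - c * g[j]) % p
    return f[:dg]

def monic_polys(p, m):
    """Monic polynomials of degree 1..m over Z_p, ordered by degree."""
    return [list(low) + [1]
            for d in range(1, m + 1)
            for low in product(range(p), repeat=d)]

def irreducibles(p, m):
    """Keep the first remaining polynomial, then discard its multiples."""
    remaining = monic_polys(p, m)
    found = []
    while remaining:
        g = remaining.pop(0)
        found.append(g)
        remaining = [f for f in remaining
                     if len(f) < len(g) or any(poly_rem(f, g, p))]
    return found

for g in irreducibles(2, 3):
    print(g)
```

With p = 2 and m = 3 the result is X, 1 + X, 1 + X + X^2, 1 + X^2 + X^3, and 1 + X + X^3, the same five polynomials as in the text, though polynomials of equal degree come out in a different order.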

If M is a positive integer, then A←ZNXMONIC M is a matrix whose rows
list the monic polynomials of degree at most M in $Z_n[X]$,
where n is N.


      N
2
      ZNXMONIC 3
0 1 0 0
1 1 0 0
0 0 1 0
1 0 1 0
0 1 1 0
1 1 1 0
0 0 0 1
1 0 0 1
0 1 0 1
1 1 0 1
0 0 1 1
1 0 1 1
0 1 1 1
1 1 1 1


If N is prime, then ZNXIRRED uses the sieve method to produce a matrix
listing the monic irreducible polynomials up to a given degree.


      ⎕←B←ZNXIRRED 3
0 1 0 0
1 1 0 0
1 1 1 0
1 1 0 1
1 0 1 1


If f is a polynomial of degree m in $Z_p[X]$, p being a prime, then one
method of factoring f is simply to divide f by all monic irreducible polynomials
in $Z_p[X]$ whose degrees do not exceed m/2. For example, to factor
$f = 1 + X + X^4 + X^6$ in $Z_2[X]$, we divide f by the irreducible polynomials
listed in the matrix B.


      B ZNXREM 1 1 0 0 1 0 1


1 0
0 0
0 0
0 0
0 1


The second, third, and fourth rows of B give a remainder of 0, and $f =
(1 + X)(1 + X + X^2)(1 + X + X^3)$.
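The same trial division can be sketched in Python for comparison. The names below are illustrative and are not the book's procedures; the list B is copied from the rows of the matrix B above with trailing zeros dropped, and coefficients are listed with the constant term first.

```python
def poly_divmod(f, g, p):
    """Quotient and remainder of f by the monic polynomial g in Z_p[X]."""
    f = [c % p for c in f]
    dg = len(g) - 1
    q = [0] * max(len(f) - dg, 1)
    for i in range(len(f) - 1, dg - 1, -1):
        c = f[i]
        q[i - dg] = c
        for j in range(dg + 1):
            f[i - dg + j] = (f[i - dg + j] - c * g[j]) % p
    return q, f[:dg]

def factor_by_trial(f, irreducibles, p):
    """Divide out each listed monic irreducible as often as it divides f."""
    factors = []
    for g in irreducibles:
        while len(f) >= len(g):
            q, r = poly_divmod(f, g, p)
            if any(r):
                break
            factors.append(g)
            f = q
    if len(f) > 1:            # a cofactor of positive degree is left over
        factors.append(f)
    return factors

B = [[0, 1], [1, 1], [1, 1, 1], [1, 1, 0, 1], [1, 0, 1, 1]]
f = [1, 1, 0, 0, 1, 0, 1]     # 1 + X + X^4 + X^6
print(factor_by_trial(f, B, 2))
# [[1, 1], [1, 1, 1], [1, 1, 0, 1]]
```

The sketch assumes f is monic and that the list contains every monic irreducible of degree at most half the degree of f, so that any leftover cofactor is itself irreducible.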

If $p^m$ is large, we cannot expect to have available the list of monic
irreducible polynomials of degree up to m/2 in order to factor a given
polynomial in $Z_p[X]$ of degree m. There is a better algorithm for factoring
polynomials in $Z_p[X]$. This algorithm, which is based on the theory of
finite fields, is discussed in Section 7.4 and is used in the procedure
ZNXFACTOR. If the vector F lists the coefficients of a polynomial f in
$Z_p[X]$ of positive degree, then G←ZNXFACTOR F is a matrix whose rows
list the monic irreducible factors of f. The global variable N, which gives the
value of p, must be a prime less than 10’. 


      N←2
      ZNXFACTOR 1 1 0 0 1 0 1


In the second example we obtained the factorization

$$752 + 346X + 713X^2 + X^3 = (311 + X)(661 + 402X + X^2)$$

in $Z_{1019}[X]$.
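A factorization reported by such a procedure is easy to confirm by multiplying the factors back together. The short Python check below uses a helper of our own, not one of the book's procedures, to do this for the factorization in $Z_{1019}[X]$.

```python
def poly_mul_mod(f, g, p):
    """Product of two coefficient lists (constant term first) in Z_p[X]."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

# (311 + X)(661 + 402X + X^2) in Z_1019[X]
print(poly_mul_mod([311, 1], [661, 402, 1], 1019))   # [752, 346, 713, 1]
```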

The algorithm used in ZNXFACTOR is different in one respect from
any of the other algorithms we have encountered so far in that it is probabilistic.
This means that at certain points in the computation random
choices are made, in this case of elements of $Z_N$. It is possible that these
choices will not lead to the factorization. However, the probability that
this will happen for any particular polynomial is less than $10^{-10}$. Although
ZNXFACTOR may report failure to find a factorization, it will never give
an incorrect factorization.



In C[X] every irreducible polynomial has degree 1. This fact, which
is known as the Fundamental Theorem of Algebra, is really a theorem in
analysis and is usually not proved in introductory algebra texts. Another
way to state this result is as follows. Every element in C[X] of positive
degree has a root in C. In R[X] irreducible polynomials have degree 1 or
2. If f is in R[X] and deg(f) > 0, then f has a root c in C by the Fundamental
Theorem of Algebra. If c is in R, then X - c divides f in R[X].
If c is not in R, then $\bar{c} \neq c$ and $f(\bar{c}) = \overline{f(c)} = \bar{0} = 0$, since f has real coefficients.
Therefore f is divisible in C[X] by $X - c$ and $X - \bar{c}$ and hence
by $g = (X - c)(X - \bar{c})$. But $g = X^2 - (c + \bar{c})X + c\bar{c}$ has real coefficients,
so g divides f in R[X].

The procedure RXFACTOR computes approximate factorizations
in R[X]. If F is a vector listing the coefficients of a polynomial f in R[X]
of positive degree, then G←RXFACTOR F is a matrix whose rows list approximations
to the monic irreducible factors of f in R[X].


      RXFACTOR ¯2 ¯1 ¯3 0 ¯1 1
¯2 1 0
 1 1 1
 1 0 1


Here we have obtained the exact factorization

$$-2 - X - 3X^2 - X^4 + X^5 = (-2 + X)(1 + X + X^2)(1 + X^2)$$

in R[X].

Techniques for approximate factorization in R[X] are properly part 
of the branch of mathematics called numerical analysis and will not be 
discussed in detail in this book. More information can be found in books 
on numerical analysis. 

Let us now turn to the problems of factorization and the determina- 
tion of primes in Z[i]. We will show that it is possible to reduce these 
problems to the corresponding ones in Z. As usual, the norm of the com- 
plex number z will be written N(z). 


THEOREM 12. If z is in Z[i] and N(z) = p is a prime in Z, then z is 
a prime in Z[i]. 

Proof. Suppose N(z) = p is a prime in Z and z = uv with u and v in
Z[i]. Then p = N(z) = N(u)N(v) and so either N(u) = 1 or N(v) = 1. But
elements in Z[i] with norm 1 are units, and so z is a prime. □


By Theorem 12, the numbers 1 + i, 1 - 2i, and -2 + 3i are all primes
in Z[i].



THEOREM 13. Let z be a prime in Z[i]. There is a prime p in Z such 
that z divides p in Z[i]. 


Proof. The norm $N(z) = z\bar{z}$ is an integer greater than 1 and can be
factored as a product $p_1 \cdots p_r$ with each $p_i$ a prime in Z. Since z divides
N(z), it follows that z divides some $p_i$. □


Theorem 13 reduces the problem of finding the primes in Z[i] to the 
factorization in Z[i] of the primes in Z. 


THEOREM 14. Let p be a prime in Z. Then either p is a prime in
Z[i] or p = uv, with u and v conjugate primes in Z[i].


Proof. Suppose p = uv in Z[i]. Then $p^2 = N(p) = N(u)N(v)$. If neither
u nor v is a unit, then N(u) = N(v) = p. In this case, both u and v are primes
in Z[i] by Theorem 12 and, since $p = N(u) = u\bar{u}$, we see that $v = \bar{u}$. □


If a prime p in Z is not prime in Z[i], then by Theorem 14 there exist 
integers x and y such that $p = (x + yi)(x - yi) = x^2 + y^2$. Thus, whether or
not p factors in Z[i] depends on whether or not p can be written as a sum 
of two integer squares. Since the factorization of 2 as (1 +i) (1 — i) is easily 
checked, we may assume p is odd. 


THEOREM 15. If p is a prime in Z and p ≡ 3 (mod 4), then p is a prime
in Z[i].

Proof. Suppose $p = x^2 + y^2$ with x and y in Z. Modulo 4 the only
squares are 0 and 1. Thus $x^2 + y^2$ is congruent modulo 4 to 0, 1, or 2. If
p is a prime and p ≡ 3 (mod 4), then p is not the sum of two integer squares
and p is prime in Z[i] by Theorem 14. □


This leaves the primes in Z congruent to 1 modulo 4. A theorem of
the French mathematician Pierre de Fermat (1601?-1665) states that every 
prime of this type has a nontrivial factorization in Z[i]. Before proving 
Fermat’s result, we will present an example showing one way to find the 
factorization using APL. 

The integer 337 is prime and congruent to 1 modulo 4. If we wish to
express 337 as $x^2 + y^2$, with x and y in Z, then we may assume 0 < x <
y, and so $x < \sqrt{337/2} = 12.9\ldots$. Thus we may proceed as follows.


      ⎕IO←1
      U←(337-(⍳12)*2)*0.5
      ⎕←X←(U=⌊U)⍳1
9
      ⎕←Y←U[X]
16


Here we find that 337 = (9 + 16i)(9 - 16i). This method is not efficient
for large primes. A better method is described in the paper by Brillhart
cited in the bibliography.
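The search carried out in the APL session above can also be written as a short Python function. The function name is our own, and exact integer square roots are used instead of the floating-point square roots of the session.

```python
from math import isqrt

def two_squares(p):
    """Return (x, y) with 0 < x <= y and x*x + y*y == p, or None if no such pair exists."""
    x = 1
    while 2 * x * x <= p:          # x <= y forces x*x <= p/2
        y = isqrt(p - x * x)
        if y * y == p - x * x:
            return x, y
        x += 1
    return None

print(two_squares(337))   # (9, 16)
print(two_squares(7))     # None, since 7 is congruent to 3 modulo 4
```

Like the APL version, this takes on the order of $\sqrt{p}$ steps, which is why it is unsuitable for large primes.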


THEOREM 16 (Fermat). Let p be a prime in Z with p ≡ 1 (mod 4).
Then there exist integers x and y with $p = x^2 + y^2$.


Proof. By Theorem 14 it is enough to show that p is not a prime in
Z[i]. Let $u = (1)(2)(3) \cdots ((p-1)/2)$. Since p ≡ 1 (mod 4), it follows
that (p - 1)/2 is even, and so

$$u = (-1)(-2)(-3) \cdots [-(p-1)/2].$$

But -k ≡ p - k (mod p), and thus

$$u \equiv (p-1)(p-2)(p-3) \cdots [(p+1)/2] \pmod{p}.$$

Therefore

$$u^2 \equiv (1)(2)(3) \cdots \left(\tfrac{p-1}{2}\right)\left(\tfrac{p+1}{2}\right) \cdots (p-1) = (p-1)! \pmod{p}.$$

At this point we invoke Wilson’s Theorem (Corollary 3.11.9), which states
that (p - 1)! ≡ -1 (mod p).

By Wilson’s Theorem, we have $u^2 \equiv -1$ (mod p) or $u^2 + 1 = cp$ for
some integer c. In Z[i] we can factor $u^2 + 1$ as (u + i)(u - i). Since p
divides neither u + i nor u - i in Z[i] but does divide their product, it follows
that p cannot be a prime in Z[i]. □
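The proof is constructive in the following sense. Since p divides (u + i)(u - i) but neither factor, a greatest common divisor of p and u + i in Z[i] is a nonunit proper divisor x + yi of p, and so $p = x^2 + y^2$. The Python sketch below follows this idea; it is our own illustration, not the book's GAUSSFACTOR, and it represents a Gaussian integer as a pair (real part, imaginary part).

```python
def gauss_rem(a, b):
    """Remainder of a on division by b in Z[i], rounding the quotient to the nearest Gaussian integer."""
    (ar, ai), (br, bi) = a, b
    n = br * br + bi * bi
    # a * conj(b) = (ar*br + ai*bi) + (ai*br - ar*bi) i; divide by n and round
    qr = (2 * (ar * br + ai * bi) + n) // (2 * n)
    qi = (2 * (ai * br - ar * bi) + n) // (2 * n)
    return (ar - (qr * br - qi * bi), ai - (qr * bi + qi * br))

def gauss_gcd(a, b):
    """Euclidean algorithm in Z[i]."""
    while b != (0, 0):
        a, b = b, gauss_rem(a, b)
    return a

def fermat_two_squares(p):
    """For a prime p congruent to 1 mod 4, return (x, y) with x*x + y*y == p."""
    u = 1
    for k in range(1, (p - 1) // 2 + 1):
        u = u * k % p                      # u = ((p-1)/2)! mod p
    assert (u * u + 1) % p == 0            # the consequence of Wilson's Theorem
    x, y = gauss_gcd((p, 0), (u, 1))
    return abs(x), abs(y)

print(fermat_two_squares(13))    # (2, 3)
```

Computing u as a factorial still takes about p/2 multiplications, so this is no faster than the direct search for large p. It only illustrates the argument.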


The procedure GAUSSFACTOR factors a nonzero nonunit in Z[i] 
into primes. 


      GAUSSFACTOR 19 9
1  1
2  3
1 ¯4


The product of the factors is in general only an associate of the original 
number. 


EXERCISES 


In the following exercises R is an integral domain. 
1 Show “is an associate of” is an equivalence relation on R.


2 Suppose d is a greatest common divisor of the elements a and b in
R. Show that any associate of d is also a greatest common divisor 
of a and b. 

3 Show that any nonzero Gaussian integer is an associate of a unique 
Gaussian integer a + bi with a > 0 and b ≥ 0.


4 Compute the monic gcd of 4+ 6X + 3X? + 6X4 + 2X° and 5 + 


4X* + 2X3 + 3X4 + 6X5 + 3X® in Z,[X] using the Euclidean 
algorithm, as in the example in the text. 

5 Compute a gcd of 33 - 4i and -6 + 35i in Z[i] using the Euclidean
algorithm. The procedure GAUSSREM may be used to compute
remainders.

6 Give a formal description of the Euclidean algorithm in a general
Euclidean domain.

7 Let F be a subfield of the field K. Then F[X] is a subring of K[X].
Suppose f and g are in F[X]. Is the monic greatest common divisor
of f and g in F[X] necessarily the same as the monic greatest common
divisor of f and g in K[X]?

*8 Let f and g be in Z[X] and suppose for all primes p the polynomials
$\bar{f}$ and $\bar{g}$ in $Z_p[X]$ represented by f and g, respectively,
have a nonconstant common factor. Show that f and g have a nonconstant
common factor in Q[X].

9 Let a be irreducible in R. Show that any associate of a is also
irreducible.

10 Do Exercise 9 with “irreducible” replaced by “prime”.

11 For what Gaussian integers z is $\bar{z}$ an associate of z?

12 Suppose z is prime in Z[i]. Show that $\bar{z}$ is also prime.

13 Show that in any UFD greatest common divisors always exist.

14 Give a definition for a least common multiple of two elements in
R. Show that if R is a UFD, then least common multiples exist
and the intersection of two principal ideals is principal.

15 Let f be a nonconstant polynomial in Z[X]. Suppose p is a prime
that does not divide the leading coefficient of f. Show that if f is
irreducible modulo p, that is, in $Z_p[X]$, then f is irreducible in
Z[X]. Give an example showing that the converse is false.

16 Prove that for any field F the ring F[X] has infinitely many monic
irreducible elements.

17 Set N←5 and A←ZNXMONIC 4. Use the sieve technique of the text
to determine the irreducible polynomials in $Z_5[X]$ of degree at
most 4. Check your result with ZNXIRRED.

18 Determine the monic irreducible polynomials in $Z_3[X]$ of degree
at most 3 using the approach of Exercise 17.

19 Factor 5, 29, and 653 into primes in Z[i] without using
GAUSSFACTOR.



20 Factor z = 23 + 11i into primes in Z[i] without using 
GAUSSFACTOR. 
[Hint: First factor N(z) in Z.]


21 Make a list of one representative from each class of associate 
primes in Z[i] having norm at most 50. 


22 Let m be an integer that is not a square. For an element $z = a +
b\sqrt{m}$ in $Z[\sqrt{m}]$, with a and b in Z, set $M(z) = a^2 - b^2 m$. Show
for all z and w in $Z[\sqrt{m}]$ we have M(zw) = M(z)M(w). Prove that
z is a unit in $Z[\sqrt{m}]$ if and only if M(z) = ±1. Show that when
m < 0, the ring $Z[\sqrt{m}]$ has only finitely many units. Show also
that if m < 0, then a nonzero element of $Z[\sqrt{m}]$ has only finitely
many divisors and describe a process for determining all divisors of
an element in $Z[\sqrt{m}]$.


23 By trial and error, find nonzero integers a and b satisfying $a^2 -
5b^2 = 1$. Thus $a + b\sqrt{5}$ is a unit in $Z[\sqrt{5}]$. Show that the group
of units in $Z[\sqrt{5}]$ is infinite.

*24 For any positive integer m that is not a square, prove that the set
of positive units in $Z[\sqrt{m}]$ is a cyclic group.

25 Show that the ideal of $Z[\sqrt{-5}]$ generated by 3 and $2 + \sqrt{-5}$ is
not principal.


11. POLYNOMIAL RINGS OVER UFD’s 


In Section 9 we showed that every Euclidean domain is a PID and in Sec- 
tion 10 we proved that every PID is a UFD. This allowed us to conclude that 
the Euclidean domains Z[i] and F[X], where F is a field, are all UFD’s. In 
this section we will discuss the following important theorem, which implies 
that several other familiar rings are UFD’s. 


THEOREM 1. Let R be a UFD. Then R[X] is also a UFD. 


Taking R in Theorem 1 to be Z, we see that Z[X] is a UFD. Also, since
$F[X_1, \ldots, X_n]$ is the polynomial ring in one variable over $F[X_1, \ldots,
X_{n-1}]$, a simple induction argument shows that $F[X_1, \ldots, X_n]$ is a UFD
too. The rings Z[X] and $F[X_1, \ldots, X_n]$, n > 1, are not Euclidean domains.
In fact, they are not even PID’s.

The proof of Theorem 1 involves three basic ideas. By Theorem 8.2 
we may consider R to be a subring of its field of fractions F. Thus R[X] is 
a subring of F[X]. By Theorems 9.2, 9.5, and 10.7, we know that F[X] is
a UFD. Finally, using our knowledge about factorization in R and in F[X],
we can demonstrate unique factorization in R[X]. All of these ideas are
present in the proof of Theorem 1 for the case R = Z. Because of this, we
will restrict ourselves to proving that Z[X] is a UFD and leave the generalization
to the reader.

Let $f = a_0 + a_1 X + \cdots + a_n X^n$ be a nonzero element of Z[X]. The
content of f is the integer $c(f) = \gcd(a_0, \ldots, a_n)$. We say f is primitive if
c(f) = 1. We can write f = c(f)g, where g is a primitive polynomial in Z[X].
In a moment we will show that irreducibility in Z[X] and in Q[X] are equivalent
for primitive polynomials. This is not true for nonprimitive polynomials.
For example, if $f = 2 + 2X^2$, then $2(1 + X^2)$ is a factorization of
f as a product of two nonunits in Z[X]. However, 2 is a unit in Q[X] and
f is irreducible in Q[X].
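The content and the primitive part are simple to compute. In the Python sketch below, which uses names of our own choosing, a polynomial is a list of integer coefficients with the constant term first.

```python
from functools import reduce
from math import gcd

def content(f):
    """c(f): the nonnegative gcd of the coefficients of a nonzero f in Z[X]."""
    return reduce(gcd, f, 0)

def primitive_part(f):
    """The primitive polynomial g with f = c(f) g."""
    c = content(f)
    return [a // c for a in f]

f = [2, 0, 2]                          # 2 + 2X^2
print(content(f), primitive_part(f))   # 2 [1, 0, 1]
```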


LEMMA 2 (Gauss’ Lemma). If f and g are primitive polynomials in
Z[X], then fg is primitive.


Proof. Suppose f and g are primitive in Z[X] and fg is not primitive.
Let p be a prime dividing c(fg). For any polynomial h in Z[X], let $\bar{h}$ be
the polynomial in $Z_p[X]$ obtained by mapping the coefficients of h into
their congruence class modulo p. The map $h \mapsto \bar{h}$ is easily seen to be a homomorphism.
(See Exercise 4.10.) Since p divides c(fg), all the coefficients of
fg are divisible by p. Thus $0 = \overline{fg} = \bar{f}\bar{g}$. But $Z_p[X]$ is an integral domain, and so
either $\bar{f} = 0$ or $\bar{g} = 0$, that is, either p divides all the coefficients of f or
p divides all the coefficients of g. This contradicts our assumption that
c(f) = c(g) = 1. □
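Gauss' Lemma is easy to test numerically. The Python sketch below multiplies two primitive polynomials and checks that the product is primitive; the sample polynomials and function names are our own. It also checks, on one example, the consequence c(fg) = c(f)c(g), which follows by applying the lemma to the primitive parts.

```python
from functools import reduce
from math import gcd

def content(f):
    """gcd of the coefficients (constant term first) of a nonzero f in Z[X]."""
    return reduce(gcd, f, 0)

def poly_mul(f, g):
    """Product of two integer coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

f, g = [2, 3], [3, 0, 2]            # 2 + 3X and 3 + 2X^2, both primitive
fg = poly_mul(f, g)
print(fg, content(fg))              # [6, 9, 4, 6] 1

u, v = [4, 6], [3, 9, 15]           # contents 2 and 3
print(content(poly_mul(u, v)))      # 6
```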

The following lemma forms the foundation for the proof that Z[X] is
a UFD.


LEMMA 3. Let f be a primitive polynomial in Z[X]. Then f is irreducible
in Z[X] if and only if f is irreducible in Q[X].


Proof. The only primitive polynomials of degree 0 are ±1. Since each
of these is a unit in both Z[X] and Q[X], we may assume that f has positive
degree and so is a nonunit in Z[X] and in Q[X]. We will prove the
lemma by showing that f has a nontrivial factorization in Z[X] if and only
if f has a nontrivial factorization in Q[X].

Suppose f is not irreducible in Z[X]. Then f = uv, with u and v nonunits
in Z[X]. If u is a unit in Q[X], then u is in Q ∩ Z[X] = Z, and so
u divides c(f) = 1. But this means that u = ±1, contradicting our assumption
that u is not a unit in Z[X]. Therefore u and also v are nonunits in Q[X]
and f is not irreducible in Q[X]. Thus the irreducibility of f in Q[X] implies
the irreducibility of f in Z[X].

Now suppose f is not irreducible in Q[X]. Then there exist elements
u and v in Q[X] such that f = uv and both u and v have positive degree.
Since u has the form

$$\frac{a_0}{b_0} + \frac{a_1}{b_1} X + \cdots + \frac{a_n}{b_n} X^n,$$

with each $a_i$ and $b_i$ in Z, we can write

$$u = \frac{p}{q}\,(a_0' + a_1' X + \cdots + a_n' X^n),$$

where p, q, and the $a_i'$ are integers, p and q are both positive, and

$$\gcd(a_0', \ldots, a_n') = 1.$$

Thus u = (p/q)u', where u' is a primitive polynomial in Z[X]. Similarly,
we can write v = (r/s)v', where r and s are positive integers and v' is primitive
in Z[X]. Hence


$$f = uv = (p/q)u'(r/s)v' = (a/b)u'v',$$


where a = pr and b = qs. Thus bf = au'v'. Now f is assumed primitive and
u'v' is primitive by Lemma 2. Therefore c(bf) = b and c(au'v') = a, so
a = b. It follows that f = u'v' is a factorization of f in Z[X], with u' and
v' nonunits, and so f is not irreducible in Z[X]. Thus the irreducibility of
f in Z[X] implies the irreducibility of f in Q[X]. □


LEMMA 4. The irreducible elements in Z[X] are the primes in Z to- 
gether with the irreducible primitive polynomials of positive degree. 


Proof. Let f be a nonzero nonunit in Z[X]. Then f = c(f)g, where
g is primitive. If f is irreducible, then either c(f) or g is a unit in Z[X].
If c(f) = 1, then f is a primitive polynomial of positive degree. If g is a
unit in Z[X], then f is an integer and any nontrivial factorization of f in
Z is a nontrivial factorization of f in Z[X], and so f is a prime (or the negative
of a prime).

All that remains is to show that each prime integer p is irreducible in
Z[X]. If p = uv, with u and v in Z[X], then u and v must both have degree
0, so u and v are in Z. Since p is a prime, this means either u or v is a unit
in Z and hence a unit in Z[X]. □


LEMMA 5. Let f be a primitive polynomial of positive degree in Z[X]. 
Then f can be factored into a product of irreducible elements in Z[X], and 
this factorization is unique up to order and associates. 


Proof. First, we show that f can be factored into a product of primitive
irreducibles. If f is irreducible in Z[X], we are done. If not, then f = uv, with
u and v nonunits in Z[X]. By Lemma 2, c(f) = c(u)c(v) = 1, and so u and
v are both primitive. Since u and v are nonunits, they both have positive
degree less than the degree of f. By induction on the degree of f, we may
factor u and v into products of primitive irreducible polynomials, and this
gives the required factorization of f.

Now suppose that $f = u_1 \cdots u_r = v_1 \cdots v_s$, with each $u_i$ and $v_j$ irreducible
in Z[X]. By Lemma 4, the $u_i$ and $v_j$ are primitive and so, by Lemma
3, they are all irreducible in Q[X]. Since Q[X] is a UFD, we have r = s and,
with a suitable renumbering of the factors, $u_i$ and $v_i$ are associates in Q[X].
This means that $v_i = (a_i/b_i)u_i$ for some $a_i$ and $b_i$ in Z. Thus $b_i v_i = a_i u_i$. The
content of $b_i v_i$ is $|b_i|$, while the content of $a_i u_i$ is $|a_i|$. Thus $|a_i| = |b_i|$ and
$a_i/b_i = \pm 1$. Therefore $u_i$ and $v_i$ are associates in Z[X] and the factorization
of f is unique up to order and associates. □


THEOREM 6. Z[X] is a UFD. 


Proof. Let f be a nonzero nonunit in Z[X]. Then f = c(f)g, where
g is primitive. We can factor c(f) into a product of prime integers and, by
Lemma 5, we can factor g into a product of primitive irreducible polynomials.
Thus we can factor f into a product of irreducible elements in
Z[X]. Suppose


$$f = p_1 \cdots p_m u_1 \cdots u_r = q_1 \cdots q_n v_1 \cdots v_s,$$

where the $p_i$ and $q_j$ are primes in Z and the $u_i$ and $v_j$ are primitive irreducible
elements of Z[X]. Then $c(f) = p_1 \cdots p_m = q_1 \cdots q_n$, and so m =
n and we may assume $p_i = q_i$, 1 ≤ i ≤ m. Thus $u_1 \cdots u_r = v_1 \cdots v_s$ and, by
Lemma 5, r = s and, after renumbering, $u_i = \pm v_i$. Thus the factorization of
f is unique. □


As we have already stated, our proof of Theorem 6 can easily be 
generalized to prove Theorem 1. If R is a UFD, then by Exercise 10.14 
greatest common divisors exist in R. This means that we can define the 
content of a polynomial f in R[X] and we can talk about primitive poly- 
nomials in R[X]. The proofs of Lemmas 2 to 5 and Theorem 6 are changed 
only to the extent of replacing Q by the field of fractions F of R and allow- 
ing for the possibility that R has units other than +1. 

Although we have shown that Z[X] is a UFD, we do not yet have any
algorithms for deciding whether or not a given polynomial in Z[X] is irreducible
or for obtaining the factorization when it is not irreducible. In Section
12 we will show that if R is a UFD with a finite set of units and there
is an algorithm for factoring in R, then there is an algorithm for factoring
in R[X].

We close this section with some remarks about linear and quadratic
factors and with an irreducibility test in Z[X].


THEOREM 7. Let R be an integral domain and let $f = a_0 + a_1 X + \cdots +
a_n X^n$ be in R[X]. Suppose n ≥ 1 and u, v, and w are in R. Then X - w divides
f in R[X] if and only if f(w) = 0. If u + vX divides f in R[X], then
u divides $a_0$ and v divides $a_n$ in R.


Proof. Since X - w is monic, we can divide f by X - w to produce a
quotient q in R[X] and a remainder r in R such that f = (X - w)q + r. Now
f(w) = (w - w)q(w) + r = r, and so f(w) = 0 if and only if X - w divides
f. If g is any nonzero element of R[X], then the leading coefficient of
(u + vX)g is the product of v and the leading coefficient of g. Similarly, the
constant term of (u + vX)g is divisible by u. Thus, if u + vX divides f, then
u divides $a_0$ and v divides $a_n$. □


Suppose f is in Z[X] and f has positive degree. If the constant term
of f is 0, then X is a factor of f. If the constant term of f is not 0, then
Theorem 7 shows that any factor of f in Z[X] with degree 1 is one of a
finite set of polynomials u + vX, where u divides the constant term of f and
v divides the leading coefficient. Of course, we may assume that v > 0. For
example, if $f = -6 + 7X + 17X^3 + 6X^4$ and u + vX divides f, then u and
v belong to the set {±1, ±2, ±3, ±6}. Since f is primitive, u and v must be
relatively prime. Assuming v > 0, we have 18 possibilities for u + vX. By
trial and error, we find that


$$f = (-1 + 2X)(3 + X)(2 + X + 3X^2).$$
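The trial-and-error search for linear factors is easily mechanized. The Python sketch below, with function names of our own, enumerates the candidates u + vX with v > 0 and gcd(u, v) = 1 and tests each one by checking whether f(-u/v) = 0. For a primitive candidate u + vX this is equivalent to divisibility in Z[X] by Lemma 3 and Theorem 7. It assumes f has a nonzero constant term.

```python
from math import gcd

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def linear_factors(f):
    """Return (factors, number tried) for f in Z[X], coefficients constant term first."""
    n = len(f) - 1
    found, tried = [], 0
    for v in divisors(f[-1]):
        for d in divisors(f[0]):
            for u in (d, -d):
                if gcd(u, v) != 1:
                    continue
                tried += 1
                # v^n * f(-u/v), computed in integers to avoid fractions
                if sum(a * (-u) ** k * v ** (n - k) for k, a in enumerate(f)) == 0:
                    found.append((u, v))
    return found, tried

f = [-6, 7, 0, 17, 6]            # -6 + 7X + 17X^3 + 6X^4
print(linear_factors(f))         # ([(3, 1), (-1, 2)], 18)
```

The two pairs found correspond to the factors 3 + X and -1 + 2X, and the count of 18 candidates agrees with the text.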


If $h = aX^2 + bX + c$ is in Z[X] and a ≠ 0, then in C[X] we can factor
h as $a(X - \alpha)(X - \beta)$, where


$$\alpha = \frac{-b + \sqrt{b^2 - 4ac}}{2a}, \qquad \beta = \frac{-b - \sqrt{b^2 - 4ac}}{2a}.$$


Thus h is irreducible in Q[X] if and only if $\alpha$ and $\beta$ are not in Q, which is
equivalent to $b^2 - 4ac$ not being a square in Z. Since $1^2 - 4(3)(2) = -23$,
the factor $2 + X + 3X^2$ of f is irreducible in Q[X] and, since it is primitive,
it is irreducible in Z[X].
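The discriminant test takes only a few lines of Python. The function name is our own; the arguments are the coefficients of $h = aX^2 + bX + c$ in that order.

```python
from math import isqrt

def quadratic_irreducible_over_Q(a, b, c):
    """True if aX^2 + bX + c (a != 0) has no rational root, i.e. b^2 - 4ac is not a square in Z."""
    disc = b * b - 4 * a * c
    return disc < 0 or isqrt(disc) ** 2 != disc

print(quadratic_irreducible_over_Q(3, 1, 2))     # True:  2 + X + 3X^2, discriminant -23
print(quadratic_irreducible_over_Q(2, 5, -3))    # False: -3 + 5X + 2X^2 = (-1 + 2X)(3 + X)
```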

There are relatively few ways known for recognizing irreducible poly- 
nomials in Q[X] of degree greater than 3 without a great deal of calcula- 
tion. One test that is sometimes useful is the following. 


THEOREM 8 (Eisenstein’s Criterion). Let $f = a_0 + a_1 X + \cdots + a_n X^n$
be in Z[X]. Assume that there is a prime p in Z such that p divides $a_0, \ldots,
a_{n-1}$, but p does not divide $a_n$ and $p^2$ does not divide $a_0$. Then f is irreducible
in Q[X].


Proof. Since p does not divide $a_n$, p does not divide c(f), and so we
may divide f by c(f) without changing the hypothesis. Thus we may assume
f is primitive. If f is reducible in Q[X], we can factor f as gh in Z[X],
with g and h of positive degree. Let $h \mapsto \bar{h}$ be the map from Z[X] to $Z_p[X]$ defined by reduction
of coefficients modulo p. Then $\bar{f} = \bar{g}\bar{h}$ and $\bar{f}$ has the form $uX^n$ for some
nonzero element u of $Z_p$. Every factor of $uX^n$ with positive degree has a
constant term of 0. Thus $\bar{g}$ and $\bar{h}$ each have 0 as their constant term. Therefore
the constant terms of g and h are each divisible by p, so the constant
term $a_0$ of f must be divisible by $p^2$, which violates our assumption. □


The polynomial f = 6 + 4X + 2X? + X° satisfies Eisenstein’s Criterion 
with p = 2 and hence is irreducible in Q[X]. Unfortunately, there are integer
polynomials, such as $1 + X^2$, that do not satisfy Eisenstein’s Criterion
for any prime p and are nevertheless irreducible in Q[X].
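Checking the hypotheses of Eisenstein's Criterion is mechanical, since only the prime divisors of $a_0$ can qualify. In the Python sketch below the function names and the first sample polynomial are our own illustrations. Coefficients are listed with the constant term first.

```python
def prime_divisors(n):
    """Distinct prime divisors of a nonzero integer, by trial division."""
    n = abs(n)
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            out.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out

def eisenstein_holds(f, p):
    """Do the hypotheses of the criterion hold for f at the prime p?"""
    *lower, an = f
    return (all(a % p == 0 for a in lower)
            and an % p != 0
            and f[0] % (p * p) != 0)

def eisenstein_primes(f):
    """All primes at which the criterion applies (possibly none)."""
    if f[0] == 0:
        return []
    return [p for p in prime_divisors(f[0]) if eisenstein_holds(f, p)]

print(eisenstein_primes([5, 10, 0, 0, 1]))   # [5]: 5 + 10X + X^4 is irreducible in Q[X]
print(eisenstein_primes([1, 0, 1]))          # []: the test says nothing about 1 + X^2
```

An empty result does not mean the polynomial is reducible; it only means the criterion does not apply.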

As a final remark, we note that the Eisenstein of Eisenstein’s Criterion 
is the German mathematician Ferdinand Eisenstein (1823-1852). 


EXERCISES 


1 Compute the content of the following polynomials in Z[X].
(a) $63 - 105X + 78X^2$.
(b) $-154 + 1001X + 182X^2 - 286X^3$.

2 The polynomial $f = (4 + 2i) - (3 + i)X + (-1 + 3i)X^2$ is in R[X],
where R = Z[i]. Compute the content of f.


3 Considering the polynomial
$g = (1 + X - X^2 - X^3) + (3 + 4X + 3X^2 + 2X^3)Y + (2 + 3X + X^2)Y^2$


to be an element of R[Y], where R = Z;[X], compute the con- 
tent of g. 


4 Suppose f and g are in Z[X] and g is primitive. Prove that if g
divides f in Q[X], then g divides f in Z[X].

5 Find all factors in Z[X] of degree 1 in the following polynomials.
(a) $9 - 9X + 14X^2 - 8X^3$.
(b) $10 + 9X + 6X^2 - 4X^3 - 3X^4$.
(c) $2 + 3X - 3X^2 + 8X^3 - 6X^4$.


6 The polynomial (X¥ + X2) + (4+ 2X + 4X7)¥ + 4x? Y? + xX’Y? 
in Z,[X, Y] contains a factor which has degree 1 in Y. Find this 
factor. 


12. INTERPOLATION 


Let R be a commutative ring and let $a_0, \ldots, a_m$ and $b_0, \ldots, b_m$ be elements
of R with the $a_i$ distinct. The process of finding a polynomial f in
R[X] such that $f(a_i) = b_i$, i = 0, ..., m, is called interpolation. It may not
be possible to find any f. For example, there is no polynomial in Z[X]
whose value at 0 is 0 and whose value at 2 is 1. When f does exist, it is never
unique. If $g = (X - a_0)(X - a_1) \cdots (X - a_m)$, then adding any multiple
of g to f yields another polynomial with the same values at $a_0, \ldots, a_m$.
The following theorem shows that if we assume that R is a field and that
f has degree at most m, then the interpolation problem always has a unique
solution.


THEOREM 1 (Lagrange Interpolation). Let $a_0, \ldots, a_m$ be distinct elements
of a field F and let $b_0, \ldots, b_m$ be elements of F (not necessarily
distinct). Then

$$f = \sum_{i=0}^{m} b_i \prod_{j \neq i} \frac{X - a_j}{a_i - a_j}$$

is the unique polynomial in F[X] having degree at most m and taking on
the value $b_i$ at $a_i$, 0 ≤ i ≤ m.


Proof. The uniqueness of f follows from Corollary 9.8. To see that
$f(a_i) = b_i$, we define

$$g_i = \prod_{j \neq i} \frac{X - a_j}{a_i - a_j}$$

and note that $g_i(a_i) = 1$ and $g_i(a_j) = 0$ for all j ≠ i. Then

$$f = \sum_{i=0}^{m} b_i g_i,$$

so

$$f(a_j) = \sum_{i=0}^{m} b_i g_i(a_j) = b_j g_j(a_j) = b_j.$$

Since each $g_i$ has degree m, the degree of f is at most m. □


As an example of Lagrange interpolation, let us find the polynomial
f in R[X] of degree at most 2 such that f(1) = -8, f(3) = 2, and f(4) = 13.
By Theorem 1,


$$f = \frac{-8(X - 3)(X - 4)}{(1 - 3)(1 - 4)} + \frac{2(X - 1)(X - 4)}{(3 - 1)(3 - 4)} + \frac{13(X - 1)(X - 3)}{(4 - 1)(4 - 3)}$$

$$= -\tfrac{4}{3}(X^2 - 7X + 12) - (X^2 - 5X + 4) + \tfrac{13}{3}(X^2 - 4X + 3)$$

$$= 2X^2 - 3X - 7.$$
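The formula of Theorem 1 translates directly into code. The Python sketch below, which is not the book's RXINTERP, uses exact rational arithmetic from the standard `fractions` module. Polynomials are coefficient lists with the constant term first, and the function names are our own.

```python
from fractions import Fraction

def poly_mul(f, g):
    """Product of two coefficient lists over Q."""
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def lagrange(a, b):
    """Coefficients of the polynomial of degree at most m with f(a[i]) = b[i]."""
    m = len(a) - 1
    f = [Fraction(0)] * (m + 1)
    for i in range(m + 1):
        g = [Fraction(b[i])]                 # builds b_i * g_i
        for j in range(m + 1):
            if j != i:
                d = Fraction(a[i] - a[j])
                g = poly_mul(g, [Fraction(-a[j]) / d, 1 / d])   # times (X - a_j)/(a_i - a_j)
        f = [x + y for x, y in zip(f, g)]
    return f

print([str(c) for c in lagrange([1, 3, 4], [-8, 2, 13])])   # ['-7', '-3', '2']
```

The result represents $-7 - 3X + 2X^2$, in agreement with the computation above.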


There are other methods for performing interpolation. For example, we 
could try to find the polynomial f in the previous example by writing it in 
the form 


$$f = c_0 + c_1(X - 1) + c_2(X - 1)(X - 3).$$


Then $c_0 = f(1) = -8$, and

$$2 = f(3) = c_0 + 2c_1 = -8 + 2c_1.$$

Thus $c_1 = 5$. Finally,

$$13 = f(4) = c_0 + 3c_1 + 3c_2 = -8 + 15 + 3c_2,$$

giving $c_2 = 2$. Thus we obtain again

$$f = -8 + 5(X - 1) + 2(X - 1)(X - 3) = 2X^2 - 3X - 7.$$
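This second method determines the coefficients one at a time, and it too is short to program. In the Python sketch below, with a function name of our own, the coefficient $c_k$ is found by evaluating at $a_k$ the part of f already determined.

```python
from fractions import Fraction

def newton_coefficients(a, b):
    """c_0, ..., c_m with f = c_0 + c_1 (X - a_0) + c_2 (X - a_0)(X - a_1) + ..."""
    c = []
    for k in range(len(a)):
        value = Fraction(0)      # value at a_k of the terms found so far
        basis = Fraction(1)      # value at a_k of (X - a_0)...(X - a_{i-1})
        for i in range(k):
            value += c[i] * basis
            basis *= a[k] - a[i]
        c.append((b[k] - value) / basis)
    return c

print([str(x) for x in newton_coefficients([1, 3, 4], [-8, 2, 13])])   # ['-8', '5', '2']
```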


Another view of interpolation is discussed in Section 6.4. 

The procedures RXINTERP and ZXINTERP perform interpolation
in R[X] and Z[X], respectively. If A and B are vectors of length N+1, then
F←A RXINTERP B is the vector of coefficients for the polynomial in
R[X] of degree at most N whose value at A[I] is B[I]. Of course, the
entries in A must be distinct.


      1 3 4 RXINTERP ¯8 2 13
¯7 ¯3 2
      ⎕←F←0 ¯1 3 5 RXINTERP 3.2 6.1 ¯2 10
3.2 ¯3.235833333 ¯0.1266666667 0.2091666667
      F RXEVAL 0 ¯1 3 5
3.2 6.1 ¯2 10

Here we have repeated the previous example and also found the cubic 
polynomial whose values at 0, —1, 3, and 5 are 3.2, 6.1, —2, and 10, re- 
spectively. As usual, the results of RXINTERP are subject to round-off 
error. 

The arguments of ZXINTERP must have integer entries and, if the 
interpolated polynomial does not have integer entries, then a domain error 
is indicated. 


      1 3 4 ZXINTERP ¯8 2 13
¯7 ¯3 2


The second arguments of RXINTERP and ZXINTERP may be matrices
listing several vectors of values, so that more than one interpolation can be
performed at one time. If A is a vector and B is a matrix with ⍴A columns,
then F←A RXINTERP B is the matrix whose Ith row is the vector of
coefficients for the polynomial in R[X] whose value at A[J] is B[I;J].




      ⎕←B←2 3⍴¯8 2 13 4 14 22
¯8  2 13
 4 14 22
      1 3 4 RXINTERP B
¯7 ¯3 2
 2  1 1
      ¯7 ¯3 2 RXEVAL 1 3 4
¯8 2 13
      2 1 1 RXEVAL 1 3 4
4 14 22


Interpolation can be used to factor polynomials in Z[X]. Let f be 
a primitive polynomial in Z[X] with degree n > 1 and suppose f is not 
irreducible. Then f can be factored as gh in Z[X], where g and h have posi- 
tive degree and one of them, say g, has degree not exceeding n/2. For any 
integer a we have f(a) = g(a)h(a), and so g(a) is a divisor of f(a). If f(a) =
0, then X - a is a factor of f and we may remove this factor before proceeding.
If f(a) ≠ 0, then f(a) has only finitely many divisors, so there are
only finitely many possibilities for g(a). We can find a factor g of f as follows.
Let m be the largest integer not exceeding n/2 and let $a_0, \dots, a_m$ be
distinct integers. If $f(a_i) = 0$ for some i, we take $g = X - a_i$. Otherwise,
we consider all possible sequences $b_0, \dots, b_m$, where $b_i$ divides $f(a_i)$. There
are only finitely many such sequences. For each sequence we interpolate
a polynomial g of degree at most m satisfying $g(a_i) = b_i$, $0 \le i \le m$. If g has
integral coefficients, we check whether or not g divides f.

Let us use the method of interpolation to factor
$f = 4 + 11X - 13X^2 + 9X^3 - 6X^4$ in Z[X]. Here n = 4 and m = 2. If we take
$a_0 = -1$, $a_1 = 0$, and $a_2 = 1$, then $f(a_0) = -35$, $f(a_1) = 4$, and $f(a_2) = 5$.
Thus $b_0$ is in $\{\pm 1, \pm 5, \pm 7, \pm 35\}$, $b_1$ is in $\{\pm 1, \pm 2, \pm 4\}$,
and $b_2$ is in $\{\pm 1, \pm 5\}$. Since we may replace
g by -g, we may assume that $b_1$, the constant term of g, is positive. Thus
there are 8(3)(4) = 96 possible choices for the sequence $b_0, b_1, b_2$.

One way of proceeding is to consider each of the 96 sequences one at
a time. For example, suppose we have $b_0 = -5$, $b_1 = 2$, and $b_2 = -1$. We find
the polynomial g in R[X] of degree at most 2 such that g(-1) = -5, g(0) =
2, and g(1) = -1 by interpolating.


      ⎕←G←¯1 0 1 RXINTERP ¯5 2 ¯1
2 2 ¯5


Here we see that $g = 2 + 2X - 5X^2$ is, in fact, in Z[X]. We now check
whether g divides f.




      F←4 11 ¯13 9 ¯6
      G RXREM F
¯1.104 8.536


It does not. 


Although it would be possible to continue in this manner with the re- 
maining 95 cases, there is a better way. We can construct a 96-by-3 matrix 
listing the 96 choices as follows. 


      B0←1 ¯1 5 ¯5 7 ¯7 35 ¯35
      B1←1 2 4
      B2←1 ¯1 5 ¯5
      I←8 3 4⊤⍳96
      B←⍉3 96⍴B0[I[0;]],B1[I[1;]],B2[I[2;]]


Here B[I;] is B0[J],B1[K],B2[L], where I = L+(4×K)+12×J and
0 ≤ J ≤ 7, 0 ≤ K ≤ 2, and 0 ≤ L ≤ 3. Next we interpolate in R[X]
to find the polynomials whose values at -1, 0, and 1 are given by the rows
of B


      G←¯1 0 1 RXINTERP B

and select the rows of G that have integer entries.

      G←(∧/G=⌊G)⌿G


(In this particular example we do not get rid of any rows of G.) Next, we
keep only those rows of G that define divisors of f in R[X].


      G←(∧/0=G RXREM F)⌿G

Finally, we remove the rows of G that do not divide f in Z[X].

      G←(∧/Q=⌊Q←F RXQUOT G)⌿G


      G
1  0  0
1  3 ¯3
4 ¯1  2


It is now easy to see that the factorization of f is $(1 + 3X - 3X^2)(4 - X + 2X^2)$.
There was some risk in using RXINTERP, RXREM, and RXQUOT
in the preceding computation due to possible round-off errors. However,
because of the small numbers involved, the risk was small.
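The whole search can also be carried out in exact arithmetic, which removes the round-off concern altogether. The following Python sketch is an illustration only and does not reproduce the APL procedures; `divisors`, `evaluate`, `interp3`, and `poly_divmod` are hypothetical helper names, and `interp3` hard-codes the interpolation points -1, 0, 1 used in this example.

```python
from fractions import Fraction
from itertools import product

def divisors(n):
    """All positive and negative divisors of a nonzero integer n."""
    n = abs(n)
    pos = [d for d in range(1, n + 1) if n % d == 0]
    return pos + [-d for d in pos]

def evaluate(p, x):
    """Evaluate a coefficient list (constant term first) by Horner's rule."""
    acc = 0
    for c in reversed(p):
        acc = acc * x + c
    return acc

def interp3(b):
    """Polynomial of degree at most 2 with values b at -1, 0, 1."""
    b0, b1, b2 = b
    return [Fraction(b1), Fraction(b2 - b0, 2), Fraction(b0 + b2, 2) - b1]

def poly_divmod(f, g):
    """Quotient and remainder of f by a nonzero g over the rationals."""
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while g[-1] == 0:              # drop zero leading coefficients
        g.pop()
    quot = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    rem = f[:]
    for shift in range(len(f) - len(g), -1, -1):
        coef = rem[shift + len(g) - 1] / g[-1]
        quot[shift] = coef
        for k, gk in enumerate(g):
            rem[shift + k] -= coef * gk
    return quot, rem

f = [4, 11, -13, 9, -6]
values = [evaluate(f, a) for a in (-1, 0, 1)]
choices = [divisors(values[0]),
           [d for d in divisors(values[1]) if d > 0],  # constant term positive
           divisors(values[2])]

factors = []
for b in product(*choices):
    g = interp3(b)
    if any(c.denominator != 1 for c in g):
        continue                   # g is not in Z[X]
    quot, rem = poly_divmod(f, g)
    if any(rem) or any(c.denominator != 1 for c in quot):
        continue                   # g does not divide f in Z[X]
    factors.append([int(c) for c in g])

print(sorted(factors))  # [[1, 0, 0], [1, 3, -3], [4, -1, 2]]
```

The three surviving rows agree with the final value of G: the trivial divisor 1 and the two quadratic factors.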


Interpolation may be used to factor polynomials over rings other than 
Z. 




THEOREM 2. Let R be a UFD with a finite group of units that is 
explicitly known. Assume also that there is an algorithm for performing 
prime factorization in R. Then there is an algorithm for carrying out prime 
factorization in R[X]. 


Proof. Let f be a polynomial in R[X] of degree n > 0. If R is finite,
then there are only finitely many polynomials in R[X] of degree less than
n, and we may try each one as a possible factor of f. Suppose R is infinite.
Then we can always choose m + 1 distinct elements $a_0, \dots, a_m$ of R,
where m is the largest integer not exceeding n/2. If $f(a_i) = 0$, we have found
the factor $X - a_i$ of f. Thus we may assume $f(a_i) \ne 0$, $0 \le i \le m$. Since there
is a factorization algorithm in R, we can write $f(a_i)$ as $p_1 \cdots p_s$, where the
$p_j$ are primes in R. Any divisor of $f(a_i)$ is an associate of a product of a subset
of the $p_j$. Since we know the units of R and there are only finitely
many, we can list the divisors of $f(a_i)$. Interpolation now gives us a finite
set of polynomials that must contain a factor of f provided f is not irreducible. □


An induction argument based on Theorem 2 shows that there is a factorization
algorithm for $Z[X_1, \dots, X_n]$.


EXERCISES 


1 Use Lagrange interpolation to find the polynomial f in Q[X] with
degree at most 2 such that f(-1) = -9, f(0) = 2, and f(2) = 18.
Check your answer using RXINTERP.

2 Let $a_0, \dots, a_m$ be distinct elements of the field F and let $b_0, \dots, b_m$
be in F. Suppose that f is the polynomial in F[X] of degree
at most m such that $f(a_i) = b_i$, $0 \le i \le m$. Show that
$f - b_0 = (X - a_0)g$ for some g in F[X] with degree at most m - 1. What are
the values of g at $a_1, \dots, a_m$? Describe a recursive procedure for
computing f. Use this procedure to perform the interpolation in
Exercise 1.


3 Find an element f of Z[X] such that $f(1) \equiv 8 \pmod{11}$,
$f(3) \equiv 1 \pmod{11}$, and $f(7) \equiv 4 \pmod{11}$.


4 Let $F = Z_3(X)$, the field of rational functions in one variable with
coefficients in $Z_3$. Find a polynomial g in F[Y] such that g(0) = X,
$g(1) = X^2 - 1$, and $g(2) = X^3 - X^2$.


5 Use the method of interpolation to factor $10 - 13X + X^2 + 11X^3 - 6X^4$
in Z[X].


*6 Factor (3 +i)+(4-)DX+(3 4+ 2X2 + (2+i)X3 +iX* in Z[i] [|X]. 




7 Let A←((?3 3 3⍴201)-100)÷10 and consider A to represent
an element of $M_3(R[X])$. Construct the vector D of length 7 such
that D[I] is RDET A RXEVAL I for I in ⍳7. Compare


      (⍳7) RXINTERP D and RXDET A.


Explain your observations. 


MODULES 


Let A be an abelian group written additively. In Section 3.2 we defined,
for each integer n and each element a of A, the element na of A. That
is, we defined a map $f: Z \times A \to A$. This map has many useful properties.
(See, for example, Theorem 3.2.4 and Exercise 3.2.5.) If R is any ring,
then a left module for R is an abelian group M together with a map from
$R \times M$ to M satisfying conditions similar to those satisfied by the map
f. This chapter is devoted to a general discussion of modules, with very
few restrictions placed on the ring R. However, many of the most important
examples of modules arise when R is Z, a field, or the polynomial ring
over a field. All of these rings are Euclidean domains. For rings of this special
type it is possible to obtain much more detailed information about modules.
This is done in Chapter 6.


1. DEFINITIONS 


Throughout this section R will be a ring. A left R-module consists of an
abelian group M together with a map from $R \times M$ to M, the image of the
pair (r, u) being written ru. In addition, the following axioms must be
satisfied for all r, s in R and all u, v in M.

1. r(u + v) = ru + rv.

2. (r + s)u = ru + su.

3. r(su) = (rs)u.

4. 1u = u.


We often say that M is a left module over R. 
Let us consider some examples of left modules. 


Example 1. By the additive version of Theorem 3.2.4, every abelian group
is a left Z-module. Thus $Z_n$, $Z_m \oplus Z_n$, and Z itself are all left Z-modules.


Example 2. Let R be a subring of a ring S. If r is in R and u is in S, then
ru is certainly defined as an element of S. It is easy to check that S is a left
R-module. We always consider R to be a subring of R[X]. Therefore R[X]
is a left R-module.


Example 3. Let I be a left ideal of the ring R. If r is in R and u is in I, then
ru is in I. It follows immediately from the ring axioms that I is a left R-module.


Example 4. Let R be any ring and let M = {0} be an abelian group with one
element. If we define r0 to be 0 for all r in R, then M becomes a left R-module.
Such a module is said to be trivial.


Example 5. Let $R = M_2(Z)$, the ring of 2-by-2 integer matrices, and let
$M = Z^2 = Z \oplus Z$, the abelian group whose elements are integer vectors of
length 2 with addition performed componentwise. For A in R and u in
M, the matrix product Au = A+.×u is in M. We leave it as an exercise to
check that this product makes M a left R-module. The following dialogue
verifies special cases of the axioms A(u + v) = Au + Av and A(Bu) = (AB)u.


      ⎕←A←2 2⍴2 ¯1 1 3
 2 ¯1
 1  3
      ⎕←B←2 2⍴4 1 3 ¯2
 4  1
 3 ¯2
      U←3 2
      V←¯1 4
      A+.×U
4 9
      ∧/(A+.×U+V)=(A+.×U)+A+.×V
1
      ∧/(A+.×B+.×U)=(A+.×B)+.×U
1


It is not hard to show that, for any ring S, the same construction makes
the direct sum $S^n = S \oplus \dots \oplus S$ with n summands into a left $M_n(S)$-module.
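For readers without an APL system, the checks in Example 5 can be imitated in Python. This is a sketch under stated assumptions: the helper names are hypothetical, and the sample values of A, B, U, and V are chosen for illustration (the sign of the first entry of V in particular is an assumption).

```python
def mat_vec(a, u):
    """Matrix-vector product, the analogue of A+.×U."""
    return tuple(sum(row[k] * u[k] for k in range(len(u))) for row in a)

def mat_mul(a, b):
    """Matrix product, the analogue of A+.×B."""
    cols = range(len(b[0]))
    return tuple(tuple(sum(row[k] * b[k][j] for k in range(len(b)))
                       for j in cols) for row in a)

def mat_add(a, b):
    return tuple(tuple(x + y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))

def vec_add(u, v):
    return tuple(x + y for x, y in zip(u, v))

IDENTITY = ((1, 0), (0, 1))

def axioms_hold(r, s, u, v):
    """Check the four left-module axioms for one choice of r, s, u, v."""
    return (mat_vec(r, vec_add(u, v)) == vec_add(mat_vec(r, u), mat_vec(r, v))
            and mat_vec(mat_add(r, s), u) == vec_add(mat_vec(r, u), mat_vec(s, u))
            and mat_vec(r, mat_vec(s, u)) == mat_vec(mat_mul(r, s), u)
            and mat_vec(IDENTITY, u) == u)

A = ((2, -1), (1, 3))
B = ((4, 1), (3, -2))
U = (3, 2)
V = (-1, 4)
print(mat_vec(A, U))            # (4, 9)
print(axioms_hold(A, B, U, V))  # True
```

As in the dialogue, this verifies special cases only; it is not a proof that the axioms hold for all matrices and vectors.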


Example 6. The arrays PLUS and TIMES in EXAMPLES are the addition
and multiplication tables, respectively, for a ring R with 8 elements. The
matrix MPLUS is the addition table for an abelian group M of order 4. The
matrix RMOD has shape 8 4 and has entries in ⍳4 in origin 0.


      ⎕IO←0
      ⍴RMOD
8 4
      ∧/,RMOD∊⍳4
1




Thus we may consider RMOD to define a map from $R \times M$ to M. The calculation


      RMOD[3;MPLUS[1;2]]=MPLUS[RMOD[3;1];RMOD[3;2]]
1


verifies the module axiom r(u + v) = ru + rv for the case r = 3, u = 1, v = 2.
Similarly,


      RMOD[4;RMOD[2;3]]=RMOD[TIMES[4;2];3]
1


checks the axiom r(su) = (rs)u for r = 4, s = 2, and u = 3. We leave it to the 
reader to formulate each of the module axioms as a single APL statement 
and to verify that PLUS, TIMES, MPLUS, and RMOD satisfy all of these 
axioms. 
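The same table-driven verification can be sketched in Python. The tables of the EXAMPLES workspace are not reproduced here, so the sketch substitutes hypothetical stand-ins: R is taken to be the integers mod 8, M the integers mod 4, and r acts on u as r*u mod 4. Only the variable names PLUS, TIMES, MPLUS, and RMOD are borrowed from the text.

```python
# Hypothetical stand-in tables in origin 0 (not the book's data).
PLUS = [[(r + s) % 8 for s in range(8)] for r in range(8)]
TIMES = [[(r * s) % 8 for s in range(8)] for r in range(8)]
MPLUS = [[(u + v) % 4 for v in range(4)] for u in range(4)]
RMOD = [[(r * u) % 4 for u in range(4)] for r in range(8)]
ONE = 1  # index of the multiplicative identity in TIMES

R, M = range(8), range(4)

# r(u + v) = ru + rv
axiom1 = all(RMOD[r][MPLUS[u][v]] == MPLUS[RMOD[r][u]][RMOD[r][v]]
             for r in R for u in M for v in M)
# (r + s)u = ru + su
axiom2 = all(RMOD[PLUS[r][s]][u] == MPLUS[RMOD[r][u]][RMOD[s][u]]
             for r in R for s in R for u in M)
# r(su) = (rs)u
axiom3 = all(RMOD[r][RMOD[s][u]] == RMOD[TIMES[r][s]][u]
             for r in R for s in R for u in M)
# 1u = u
axiom4 = all(RMOD[ONE][u] == u for u in M)

print(axiom1, axiom2, axiom3, axiom4)  # True True True True
```

Each axiom becomes a single exhaustive check over the index ranges, which is the shape the corresponding APL propositions would take.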


The following theorem lists some elementary results that hold for any 
left module. 


THEOREM 1. Let M be a left R-module. For all r in R and all u in
M, we have

(a) r0 = 0.
(b) 0u = 0.
(c) (-r)u = -(ru).
(d) If r is a unit and ru = 0, then u = 0.


Proof. The proofs of (a), (b), and (c) require only slight modifications 
of the proof of Lemma 4.1.2a, b. Note that in (a) both zeros refer to the 
additive identity of M, while in (b) the first zero refers to the additive iden- 
tity of R. Note also that in (c) the first minus sign refers to negation in 
R while the second minus sign refers to negation in M. To prove (d), assume 
r is a unit in R and ru = 0. Then


\[ u = 1u = (r^{-1}r)u = r^{-1}(ru) = r^{-1}0 = 0. \] □


So far we have talked only about left modules. It is clear that the term
“right R-module” should mean an abelian group M and a map $(u, r) \mapsto ur$ of
$M \times R$ to M such that the analogues of axioms 1 to 4 hold. But do we really
need to distinguish between right and left modules? Suppose M is a left
R-module. Can’t we make M into a right R-module by defining ur to be
ru for all u in M and all r in R? The answer is that we can if R is commutative,
but we run into serious problems otherwise. For example, one of the
right module axioms is (ur)s = u(rs). If we have defined ur to be ru, then

(ur)s = s(ur) = s(ru) = (sr)u = u(sr). 




Thus (ur)s = u(sr), not u(rs). Of course, if R is commutative, everything is
all right. The following theorem gives the whole story. Recall that the opposite
ring $R^{\mathrm{op}}$ of a ring R is obtained by defining a new multiplication
$\circ$ on R, with $r \circ s = sr$.


THEOREM 2. Let M be a left R-module. If we define ur to be ru for
all u in M and all r in R, then M becomes a right $R^{\mathrm{op}}$-module.


Proof. The axiom that gave us trouble before was the third. Now we
have $(ur)s = u(sr) = u(r \circ s)$. The other axioms are easily checked. □
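A small numerical experiment makes the failure and its repair concrete. The Python sketch below takes R to be the 2-by-2 integer matrices acting on integer vectors of length 2; the matrices r and s and the vector u are arbitrary choices for illustration, and `right_act` is a hypothetical name for the attempted right action.

```python
def mat_vec(a, u):
    """Matrix-vector product ru for a matrix r and a vector u."""
    return tuple(sum(row[k] * u[k] for k in range(len(u))) for row in a)

def mat_mul(a, b):
    """Matrix product in R."""
    cols = range(len(b[0]))
    return tuple(tuple(sum(row[k] * b[k][j] for k in range(len(b)))
                       for j in cols) for row in a)

def right_act(u, r):
    """The attempted right action: ur is defined to be ru."""
    return mat_vec(r, u)

r = ((1, 1), (0, 1))
s = ((1, 0), (1, 1))
u = (1, 0)

lhs = right_act(right_act(u, r), s)    # (ur)s
via_rs = right_act(u, mat_mul(r, s))   # u(rs), product taken in R
via_op = right_act(u, mat_mul(s, r))   # u(r o s), product taken in R^op

print(lhs, via_rs, via_op)  # (1, 1) (2, 1) (1, 1)
```

Here (ur)s differs from u(rs) but agrees with u(r o s), exactly as the theorem predicts for a noncommutative ring.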


If R is commutative, then R and $R^{\mathrm{op}}$ are the same ring. Thus every
left R-module is a right R-module and every right R-module is a left R-module.
In this case we are justified in referring simply to R-modules and
using either right or left notation, whichever is most convenient. However,
if R is not commutative, we must be careful to make the distinction
between right and left R-modules. For example, let S be any ring,
$R = M_n(S)$, and $M = S^n$. If A is in R and u is in M, then defining Au to be
A+.×u makes M into a left R-module and defining uA to be u+.×A
makes M into a right R-module. These two module structures on M are
similar in many ways, but they are different and cannot be identified or considered
the same.

A vector space is a module over a field. Historically, vector spaces were 
among the first modules to be considered. As we continue our study of 
modules, we will come across several examples of a concept with two names, 
one used when referring to modules over arbitrary rings and the other 
used only in connection with vector spaces. This is a case of the older 
vector space names coexisting with more modern terminology. Since it 
seems likely that the vector space names will continue to be widely used, 
we will perpetuate this dual terminology here. 

If we are to continue the pattern of Chapter 3, where we defined
groups, subgroups, and quotient groups, and of Chapter 4, where we defined
rings, subrings, and quotient rings, then we should next define submodules
and quotient modules. Let M be a left R-module. An R-submodule
of M is a subgroup N of the abelian group (M,+) such that ru is in N for all
r in R and all u in N. These conditions ensure that N is a left R-module
in its own right. If the ring R is clear from context, we simply call N a
submodule of M. If R is a field, so that M is a vector space over R, then
a submodule of M is usually called a subspace.

Here are some examples of submodules. 


Example 7. If A is an abelian group, then any subgroup of A is a Z-submodule
of A.


Example 8. Let $M = Z^2$ be made into a left $M_2(Z)$-module, as in Example
5. The set N of all vectors (x,y) in M with x and y both even is an $M_2(Z)$-submodule
of M. However, the set L of (x,y) in M with x even is not an
$M_2(Z)$-submodule, as the following calculation shows.


      ⎕←A←2 2⍴1 ¯1 ¯2 1
 1 ¯1
¯2  1
      U←2 3
      A+.×U
¯1 ¯1

Here U is in L, but A+.×U is not in L. It is true that L is a Z-submodule
of M.
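The two claims of Example 8 can be probed numerically. In the Python sketch below, `in_N` and `in_L` are hypothetical membership tests, the matrix A and vector U are sample values chosen so that U lies in L while AU does not (the signs in A are an assumption), and the closure of N is only spot-checked on random inputs rather than proved.

```python
import random

def mat_vec(a, u):
    """Matrix-vector product Au."""
    return tuple(sum(row[k] * u[k] for k in range(len(u))) for row in a)

def in_N(w):
    """Both coordinates even."""
    return w[0] % 2 == 0 and w[1] % 2 == 0

def in_L(w):
    """First coordinate even."""
    return w[0] % 2 == 0

A = ((1, -1), (-2, 1))
U = (2, 3)
AU = mat_vec(A, U)
print(AU, in_L(U), in_L(AU))  # (-1, -1) True False

# Spot check: random integer matrices keep random vectors of N inside N.
rng = random.Random(0)
def rand_matrix():
    return tuple(tuple(rng.randint(-9, 9) for _ in range(2)) for _ in range(2))
def rand_even_vector():
    return (2 * rng.randint(-9, 9), 2 * rng.randint(-9, 9))

n_closed = all(in_N(mat_vec(rand_matrix(), rand_even_vector()))
               for _ in range(1000))
print(n_closed)  # True
```

A single counterexample settles the claim about L, whereas the random test for N is merely evidence; the proof is the subject of an exercise.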


Example 9. If we consider a ring R to be a left module over itself, then the 
submodules of R are the left ideals of R. 


Given two submodules of a left R-module, we can construct two addi- 
tional submodules. 


THEOREM 3. Let U and V be submodules of a left R-module M.
Then U + V and $U \cap V$ are submodules of M.


Proof. By definition, $U + V = \{u + v \mid u \in U, v \in V\}$. Since U and V
are subgroups of (M,+), it follows that both U + V and $U \cap V$ are subgroups
of (M,+). If $r \in R$, $u \in U$, and $v \in V$, then r(u + v) = ru + rv is in
U + V, and so U + V is a submodule. If w is in $U \cap V$, then w is in U, and
so rw is in U. Similarly, rw is in V, and so rw is in $U \cap V$. Hence $U \cap V$ is
a submodule. □


Let X be a subset of a left R-module M. An element u of M is said to
be an R-linear combination of the elements in X if u can be written as
$r_1 x_1 + \dots + r_n x_n$, where each $r_i$ is in R and each $x_i$ is in X. Note that
although X may be infinite, only finitely many elements of X are used to
obtain u. It is easy to see that the set N of all R-linear combinations of the
elements of X is a submodule of M. We say that N is the submodule generated
by X and denote it by $\langle X \rangle$. If $X = \{x\}$, then $\langle X \rangle = Rx = \{rx \mid r \in R\}$.
If $X = \{x_1, \dots, x_n\}$ is finite, then


\[ \langle X \rangle = \langle x_1, \dots, x_n \rangle = Rx_1 + \dots + Rx_n. \]


A submodule is called cyclic if it is generated by a single element and finitely
generated if it has a finite generating set. In the context of vector spaces,
one calls $\langle X \rangle$ the subspace spanned by X, and one speaks of a subspace
spanned by a finite set as being finite dimensional.




Let N be a submodule of a left R-module M. Since N is a subgroup of 
(M,+), we can form the quotient group M/N as an abelian group. 


LEMMA 4. If for all u in M and all r in R we define r(N + u) to be
N + ru, then M/N becomes a left R-module.


Proof. We must first show that this module multiplication is well
defined. Suppose N + u = N + v. Then u - v is in N. Since N is a submodule,
it follows that r(u - v) = ru - rv is in N and, hence, that N + ru = N + rv.
The verification of the module axioms now becomes quite straightforward.
For example, for all u and v in M and all r in R, we have


\[
\begin{aligned}
r[(N + u) + (N + v)] &= r(N + u + v) = N + r(u + v) \\
&= N + ru + rv = (N + ru) + (N + rv) \\
&= r(N + u) + r(N + v).
\end{aligned}
\]

Checking the remaining axioms is left as an exercise. □


The module M/N is called the quotient module of M modulo N. Since 
quotient modules can be formed modulo any submodule, it is best to think 
of submodules as the analogues of normal subgroups or (two-sided) ideals 
instead of as the analogues of subgroups or subrings. With vector spaces,
the term “quotient space” is used instead of “quotient module”.

After defining submodules and quotient modules, we take the obvious
next step and define homomorphisms from one module to another. This
is done only for modules over the same ring. Let M and N be left modules
over the ring R. An R-homomorphism from M to N is a map $f: M \to N$ such
that for all r in R and all u, v in M we have


1. (u + v)f = uf + vf.
2. (ru)f = r(uf).


Thus, in particular, f is a homomorphism of abelian groups. The map taking
each element u of M to 0 in N is an R-homomorphism, as is the identity map
on M. If L is a submodule of M, then $u \mapsto L + u$ is an R-homomorphism
of M onto M/L. If R is commutative and a is in R, then the evaluation map
at a, under which a polynomial g in R[X] is mapped to its value g(a), is
an R-homomorphism of R[X] onto R. As with group or ring homomorphisms,
the composition of two R-homomorphisms is again an R-homomorphism.
If V and W are vector spaces over the same field F, then an F-homomorphism
from V to W is usually called a linear transformation.

If $f: M \to N$ is an R-homomorphism, then the kernel of f is, as usual,
$\{0\}f^{-1}$, the set of all elements u in M such that uf = 0.


THEOREM 5. Let $f: M \to N$ be an R-homomorphism and let K and
L be submodules of M and N, respectively. Then Kf is a submodule of
N and $Lf^{-1}$ is a submodule of M. In particular, the kernel of f is a submodule
of M. If f is bijective, then $f^{-1}$ is an R-homomorphism from N to M.


Proof. By Theorem 3.4.1, Kf and $Lf^{-1}$ are subgroups of (N,+) and
(M,+), respectively. If u is in Kf, then u = vf for some v in K. Then, for
any r in R, we have ru = r(vf) = (rv)f and, since rv is in K, it follows that
ru is in Kf. Thus Kf is a submodule of N. Similarly, if x is in $Lf^{-1}$, then
y = xf is in L. Therefore (rx)f = r(xf) = ry is in L. Hence rx is in $Lf^{-1}$
and $Lf^{-1}$ is a submodule of M. The kernel of f is the inverse image of the
trivial submodule {0} of N. Thus the kernel of f is a submodule of M. Finally,
suppose f is bijective. Then, by Theorem 3.4.3, $f^{-1}$ is a homomorphism
of (N, +) onto (M, +). If u is in N, then u = vf, where $v = uf^{-1}$. For r in
R we have (rv)f = r(vf) = ru. Therefore $(ru)f^{-1} = rv = r(uf^{-1})$. Thus $f^{-1}$ is
an R-homomorphism. □

A bijective R-homomorphism is called an R-isomorphism, and two R-modules
are R-isomorphic if there is an R-isomorphism from one to the
other. The three basic theorems on group isomorphisms carry over to R-isomorphisms.
The First Isomorphism Theorem becomes the following
theorem.


THEOREM 6. Let $f: M \to N$ be a surjective R-homomorphism with
kernel K. Then $K + u \mapsto uf$ defines an R-isomorphism g of M/K onto N.


Proof. By Theorem 3.5.7, we know that g is well defined and g is an
isomorphism of abelian groups. If r is in R and u is in M, then


\[ [r(K + u)]g = (K + ru)g = (ru)f = r(uf) = r[(K + u)g]. \]

Therefore g is an R-isomorphism. □


The other two isomorphism theorems are covered by the following 
theorem. 


THEOREM 7. Let U, V, and W be submodules of the left R-module
M with $V \subseteq W$. Then


(a) (U + V)/U is R-isomorphic to $V/(U \cap V)$.

(b) W/V is a submodule of M/V, and (M/V)/(W/V) is R-isomorphic
to M/W.

Proof. See Exercise 19. □


Let M and N be left R-modules. In Section 3.6 we defined $M \oplus N$ to
be the abelian group with $M \times N$ as its set of elements and
$(u_1, v_1) + (u_2, v_2) = (u_1 + u_2, v_1 + v_2)$. We can make $M \oplus N$ into a left R-module
by defining r(u,v) to be (ru,rv) for all r in R and all (u,v) in $M \times N$. We call
this module the external direct sum of M and N.




As with groups, there is a notion of an internal direct sum of modules.
If U and V are submodules of a module M, then the map $f: (u, v) \mapsto u + v$
is an R-homomorphism of $U \oplus V$ into M. We say M is the internal direct
sum of U and V and write $M = U \oplus V$ if f is an R-isomorphism. This is equivalent
to saying that M = U + V and $U \cap V = \{0\}$.


EXERCISES 


1 Verify that the axioms for a left module hold in Examples 2 to 5.


2 Formulate each of the four left module axioms as a single APL
proposition using the four matrices in Example 6. Enter these
propositions at a terminal and check that they do hold for this example.


3 Complete the proof of Theorem 1.


4 Let R be the ring and M the R-module of Example 6. Construct
the multiplication table for $R^{\mathrm{op}}$ and the matrix describing the
structure of M as a right $R^{\mathrm{op}}$-module. Formulate the right module
axioms as APL propositions and verify them for M and $R^{\mathrm{op}}$.


5 Let R be a ring and M an abelian group, and suppose that $(r, u) \mapsto ru$
is a map of $R \times M$ to M satisfying the first three axioms for a
left module but not necessarily the fourth. Set $K = \{u \in M \mid 1u = u\}$
and $L = \{u \in M \mid 1u = 0\}$. Show that as an abelian group $M = K \oplus L$.
Show also that K is a left R-module and ru = 0 for all r in R and
all u in L.


6 Let $f: R \to S$ be a ring homomorphism and let M be a left S-module.
Show how M can be made into a left R-module.


7 Let V be a vector listing the elements of a subset of ⍳4 in origin 0.
Write an APL proposition corresponding to the statement that
V is a submodule of the module in Example 6.


8 Show that the module in Example 6 has no proper nontrivial
submodules.


9 Prove that the subset N in Example 8 is a submodule.


10 Prove the statement made in Example 9.


11 Let X be a subset of a left R-module M. Show that the set N of all
R-linear combinations of the elements in X is a submodule of
M. Is N adequately defined when $X = \emptyset$? What is a reasonable definition
of $\langle \emptyset \rangle$?


12 What is the submodule of the module in Example 5 generated by
(3,0)?




13 Complete the proof of Lemma 4.


14 Let N be a submodule of a left R-module M. Show that there is a
1-1 correspondence between the submodules of M/N and the submodules
of M that contain N. (Compare Theorem 3.5.9.)


15 Complete the proof of Theorem 2. 


16 Let M be a left module over the commutative ring R. Show that
for any r in R the map $u \mapsto ru$ is an R-homomorphism of M into
itself. Thus $\{ru \mid u \in M\}$ is a submodule of M.


17 Prove that Q is not finitely generated as a Z-module.


18 Let H be a vector of length 4 with components in ⍳4 in origin 0.
Write an APL proposition corresponding to the assertion that
H is a module homomorphism of the module in Example 6 into
itself.


19 Prove Theorem 7 by making the appropriate changes in the solu- 
tions to Exercises 15 and 16 of Section 3.5. 


20 An R-module M satisfies the ascending chain condition (ACC)
on submodules if there are no infinite sequences $U_1 \subseteq U_2 \subseteq \dots$
of submodules, with each $U_i$ a proper subset of $U_{i+1}$. Show that
M satisfies the ACC for submodules if and only if every submodule
of M is finitely generated. (See Theorem 4.9.6 and Exercise 4.9.11.)


21 Suppose an R-module M satisfies the ACC on submodules and let 
U be a submodule of M. Show that M/U satisfies the ACC on 
submodules. 


2. FREE MODULES 


Throughout this section we will assume that R is a nontrivial ring, that is,
$1 \ne 0$. Let $x_1, \dots, x_n$ be a sequence of elements of a left R-module M. We
say that $x_1, \dots, x_n$ are linearly independent over R if the only way 0 can
be written as an R-linear combination $r_1 x_1 + \dots + r_n x_n$ of the $x_i$ is the
obvious way: with $r_1 = r_2 = \dots = r_n = 0$. For example, the real numbers
1 and $\sqrt{2}$ are linearly independent over $\mathbf{Q}$. To see this, suppose r and s
are rational numbers and $r1 + s\sqrt{2} = 0$. If $s \ne 0$, we can divide by s to obtain
$\sqrt{2} = -r/s$. However, $\sqrt{2}$ is irrational and -r/s is a rational number.
Thus s must be 0, and so r = 0 too. The equation $(\sqrt{2})1 + (-1)\sqrt{2} = 0$
shows that 1 and $\sqrt{2}$ are not linearly independent over $\mathbf{R}$, the field of
real numbers. We say that 1 and $\sqrt{2}$ are linearly dependent over $\mathbf{R}$.
If i is a complex number with $i^2 = -1$, then a similar argument shows
that 1 and i are linearly independent over $\mathbf{R}$ but not over $\mathbf{C}$.




THEOREM 1. Let M be a left R-module generated by $x_1, \dots, x_n$.
If $x_1, \dots, x_n$ are linearly independent over R, then each element of M can
be written as an R-linear combination of the $x_i$ in exactly one way.


Proof. Suppose $r_1, \dots, r_n$ and $s_1, \dots, s_n$ are in R and
$r_1 x_1 + \dots + r_n x_n = s_1 x_1 + \dots + s_n x_n$. Then


\[ (r_1 - s_1)x_1 + \dots + (r_n - s_n)x_n = 0. \]


Since the $x_i$ are linearly independent over R, this means that for $1 \le i \le n$
we have $r_i - s_i = 0$ and, hence, $r_i = s_i$. □


We can extend the notion of linear independence to arbitrary subsets
of a module M as follows. A subset X of M is said to be linearly independent
over R if every finite sequence $x_1, \dots, x_n$ of distinct elements of X is
linearly independent over R. By this definition, the empty set $\emptyset$ is always
linearly independent. (See Exercise 7.)

A basis for M is a linearly independent generating set. We say that
M is a free R-module if M has a basis. For example, in $R^n$ let
$x_i = (0, \dots, 0, 1, 0, \dots, 0)$, where each component is 0 except for the ith, which is 1.
Since $1 \ne 0$, the $x_i$ are distinct. If $(r_1, \dots, r_n)$ is in $R^n$, then
$(r_1, \dots, r_n) = r_1 x_1 + \dots + r_n x_n$, so the $x_i$ generate $R^n$. Moreover, if
$r_1 x_1 + \dots + r_n x_n = 0$, then each $r_i$ is 0, so the $x_i$ are linearly independent. Thus
$x_1, \dots, x_n$ is a basis for $R^n$. We call this the standard basis for $R^n$. Not
every left R-module is free. In the Z-module $Z_2$ we have 2u = 0 for all u in
$Z_2$. Thus no nonempty subset of $Z_2$ is linearly independent, and $Z_2$ does
not have a basis.

As a further example, let us consider the Z-submodule M of $Z^3$ generated
by


\[ u_1 = (3, -1, 2), \quad u_2 = (0, 2, 5), \quad u_3 = (0, 0, 4). \]


We will show that $u_1, u_2, u_3$ are linearly independent over Z, so M is a
free Z-module with basis $u_1, u_2, u_3$. To check linear independence, we
assume $r_1, r_2, r_3$ are integers such that $r_1 u_1 + r_2 u_2 + r_3 u_3 = 0$. The first
component of $r_1 u_1 + r_2 u_2 + r_3 u_3$ is $3r_1$, so $r_1$ must be 0. The second
component of $r_1 u_1 + r_2 u_2 + r_3 u_3$ is thus $2r_2$ and, therefore, $r_2 = 0$. Finally,
the third component of $r_1 u_1 + r_2 u_2 + r_3 u_3$ is $4r_3$, and $r_3 = 0$. Thus
$u_1, u_2, u_3$ are linearly independent and M is a free Z-module.
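The independence argument can be cross-checked by machine. The Python sketch below does two things: it searches a small box of integer coefficients for nontrivial combinations equal to zero, and it computes the determinant of the matrix whose rows are the three generators. The helper names `combo` and `det3` are hypothetical, and the search box is an arbitrary choice, so the search is evidence rather than proof; the nonzero determinant is what rules out any dependence, since an integer dependence would also be a rational one.

```python
from itertools import product

U = [(3, -1, 2), (0, 2, 5), (0, 0, 4)]

def combo(rs, vecs):
    """The combination r_1 u_1 + r_2 u_2 + r_3 u_3."""
    return tuple(sum(r * v[k] for r, v in zip(rs, vecs)) for k in range(3))

def det3(m):
    """Determinant of a 3-by-3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Every coefficient triple in a small box whose combination is zero.
zero_combos = [rs for rs in product(range(-5, 6), repeat=3)
               if combo(rs, U) == (0, 0, 0)]

print(zero_combos)  # [(0, 0, 0)]
print(det3(U))      # 24
```

Because the matrix is triangular, the determinant is just the product 3 * 2 * 4 of the diagonal entries, mirroring the component-by-component argument in the text.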


THEOREM 2. Let M be a free left R-module with basis X and let
N be a left R-module. If $f: X \to N$ is any map, then there is a unique R-homomorphism
$g: M \to N$ such that xg = xf for all x in X.




Proof. We will prove the theorem only in the case in which X is finite,
leaving the infinite case as an exercise. Suppose |X| = n and
$X = \{x_1, \dots, x_n\}$. Set $x_i f = y_i$. If u is in M, then u can be written uniquely as
$r_1 x_1 + \dots + r_n x_n$, where each $r_i$ is in R. If g exists, then


\[
\begin{aligned}
ug &= (r_1 x_1 + \dots + r_n x_n)g = r_1(x_1 g) + \dots + r_n(x_n g) \\
   &= r_1(x_1 f) + \dots + r_n(x_n f) = r_1 y_1 + \dots + r_n y_n.
\end{aligned}
\]

Thus g is unique, and all that remains is to show that the map g taking
$u = r_1 x_1 + \dots + r_n x_n$ to $r_1 y_1 + \dots + r_n y_n$ is an R-homomorphism. Suppose
$v = s_1 x_1 + \dots + s_n x_n$, with each $s_i$ in R. Then


\[
(u + v)g = \Bigl[\sum_i (r_i + s_i)x_i\Bigr]g = \sum_i (r_i + s_i)y_i
         = \sum_i r_i y_i + \sum_i s_i y_i = ug + vg.
\]

Similarly, (ru)g = r(ug) for all r in R. □
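A concrete instance may help fix the construction. In the Python sketch below, M is the free Z-module of integer vectors of length 3 with its standard basis, and the target module N is taken, purely as an assumption for illustration, to be the integers mod 6. The function `extend` and the chosen images are hypothetical.

```python
MOD = 6  # assumption: the target module N is the integers mod 6

def extend(images):
    """Return the map g sending (r_1, ..., r_n) to r_1 y_1 + ... + r_n y_n,
    where y_i = images[i] is the chosen image of the ith basis vector."""
    def g(u):
        return sum(r * y for r, y in zip(u, images)) % MOD
    return g

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(r, u):
    return tuple(r * a for a in u)

images = [2, 3, 5]   # hypothetical choice of y_1, y_2, y_3
g = extend(images)

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
u, v, r = (1, -4, 2), (3, 0, -7), 5

agrees_on_basis = [g(e) for e in basis] == images
additive = g(add(u, v)) == (g(u) + g(v)) % MOD
homogeneous = g(scale(r, u)) == (r * g(u)) % MOD
print(agrees_on_basis, additive, homogeneous)  # True True True
```

The formula for g is forced by its values on the basis, which is the uniqueness half of the theorem; the two checks at the end sample the homomorphism property.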


COROLLARY 3. Let N be a left R-module generated by n elements.
Then N is isomorphic to a quotient module of $R^n$.


Proof. Let N be generated by $y_1, \dots, y_n$ and let $x_1, \dots, x_n$ be the
standard basis for $R^n$. By Theorem 2, the map g taking
$(r_1, \dots, r_n) = r_1 x_1 + \dots + r_n x_n$ to $r_1 y_1 + \dots + r_n y_n$ is an R-homomorphism of $R^n$
into N. Since the $y_i$ generate N, the map g is actually surjective so, by Theorem
1.6, N is isomorphic to $R^n/K$, where K is the kernel of g. □


We have seen a special case of Corollary 3 in Exercise 3.6.5. 


COROLLARY 4. If N is a left R-module possessing a basis with n
elements, then N is isomorphic to $R^n$.


Proof. Let $y_1, \dots, y_n$ be a basis for N and let $g: R^n \to N$ map
$(r_1, \dots, r_n)$ to $r_1 y_1 + \dots + r_n y_n$. Then, as in the proof of Corollary 3, g maps
$R^n$ onto N. Suppose $(r_1, \dots, r_n)$ is in the kernel of g. Then
$r_1 y_1 + \dots + r_n y_n = 0$ and, by the linear independence of the $y_i$, this implies that
$r_1 = r_2 = \dots = r_n = 0$. Thus the kernel of g is trivial and g is injective. Therefore
g is an R-isomorphism. □


Corollary 4 shows that every free left R-module N with a finite basis
is isomorphic to $R^n$ for some integer n. If R is finite, then $|N| = |R|^n$ and,
as |R| > 1, the integer n is uniquely determined. It is natural to ask whether
n is always unique. Put another way, can $R^m$ be isomorphic to $R^n$ as a left
R-module when $m \ne n$? We will provide an answer to this question later
in this section.

For each positive integer n the module $R^n$ is a free module having a
basis with n elements. Free modules with infinite bases exist also, and every
R-module is isomorphic to a quotient module of a free R-module. (See Exercises
8 and 9.) For vector spaces, things are especially nice.


THEOREM 5. Let V be a vector space over the field F. Then V is a free 
F-module. 


Proof. We will assume that V is finite dimensional. The theorem is true 
without this assumption, but the proof in the infinite dimensional case is 
beyond the scope of this text. 

Since V is finite dimensional, there is a finite subset X of V that spans
V. Among all such sets X choose one such that n = |X| is as small as possible.
If n = 0, then $X = \emptyset$ and X is a basis for V. (See Exercises 7 and 1.11.)
Thus we may assume $n \ge 1$. Let $X = \{x_1, \dots, x_n\}$ and suppose X is not a
basis for V. Then the elements of X must be linearly dependent over F, and
so there exist elements $r_1, \dots, r_n$ of F such that $r_1 x_1 + \dots + r_n x_n = 0$ and
not all of the $r_i$ are 0. Let $r_i$ be the first nonzero coefficient. Since F is a
field, $r_i$ has a multiplicative inverse in F. Multiplying by $r_i^{-1}$, we obtain

\[ x_i + r_i^{-1} r_{i+1} x_{i+1} + \dots + r_i^{-1} r_n x_n = 0 \]

or

\[ x_i = -r_i^{-1} r_{i+1} x_{i+1} - \dots - r_i^{-1} r_n x_n. \]

Therefore $x_i$ is in $\langle X - \{x_i\} \rangle$, and V is spanned by $X - \{x_i\}$, contradicting
our choice of X. Thus X is a basis for V and V is a free F-module. □

Let A be an m-by-n matrix with entries in R. Each row of A is an
element of $R^n$, so we may consider the submodule of $R^n$ generated by the
rows of A. We will denote this submodule by $S_R(A)$ or, simply, S(A) when
the ring R is clear from context. If we use APL indexing for A, then a typical
element of $S_R(A)$ is $r_1 A[1;] + \dots + r_m A[m;]$ or u+.×A, where
$u = (r_1, \dots, r_m)$ is in $R^m$. Since the rows of the n-by-n identity matrix I are the
standard basis for $R^n$, we have $S_R(I) = R^n$. In the following discussion we
will usually suppress the symbol +.× for matrix multiplication.

THEOREM 6. Let A and B be l-by-m and m-by-n matrices over R, respectively.
Then $S_R(AB) \subseteq S_R(B)$. If l = m and A is a unit in $M_m(R)$,
then S(AB) = S(B). If C is any matrix over R with n columns such that
$S_R(C) \subseteq S(B)$, then there exists a matrix D such that C = DB.


Proof. The ith row of AB is A[i;]B, which is in $S_R(B)$. Thus $S_R(AB) \subseteq S_R(B)$.
If A is a unit in $M_m(R)$, then


\[ S_R(B) = S_R(A^{-1}(AB)) \subseteq S_R(AB), \]


and so $S_R(AB) = S_R(B)$. Finally, suppose C is a matrix with n columns
such that $S_R(C) \subseteq S_R(B)$. Then each row of C is in $S_R(B)$ and, for each
row index i of C, there is a vector D[i;] such that C[i;] = D[i;]B. However,
this means that C = DB. □


COROLLARY 7. If R is commutative and n is a positive integer, then
a matrix A in $M_n(R)$ is a unit if and only if $S_R(A) = R^n$.

Proof. If A is a unit and I is the n-by-n identity matrix, then

\[ R^n = S_R(I) = S_R(A^{-1}A) \subseteq S_R(A), \]


and so $S_R(A) = R^n$. On the other hand, suppose A is in $M_n(R)$ and $S_R(A) = R^n$.
Then $S_R(I) \subseteq S_R(A)$ and, by Theorem 6, there is a matrix C such that
CA = I. Now C is in $M_n(R)$ and, if we take determinants, we obtain (det
C)(det A) = det I = 1. Thus det A is a unit in R and A is a unit in $M_n(R)$ by
Theorem 4.7.11. □


Let us illustrate the usefulness of Corollary 7 with an example in which 
R= Z. Let A be the following matrix. 


O<A+3 3p1 3125 112 °1%4 


1 3 1 
2 5 4 
1 21 


The determinant of A is −1.


      ZDET A
¯1


Since −1 is a unit in Z, the rows of A generate $Z^3$.
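For readers without access to CLASSLIB, the test in Corollary 7 can be sketched in Python for R = Z. This is only an illustration under stated assumptions: the names int_det and rows_generate_Zn are hypothetical, the determinant is computed by fraction-free (Bareiss) elimination rather than by whatever method ZDET uses, and the sample matrices are illustrative choices, not the matrix A above.

```python
def int_det(matrix):
    """Exact determinant of a square integer matrix (Bareiss elimination)."""
    a = [row[:] for row in matrix]
    n = len(a)
    sign, prev = 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:
            # Look for a row below with a nonzero entry in column k and swap.
            swap = next((i for i in range(k + 1, n) if a[i][k] != 0), None)
            if swap is None:
                return 0
            a[k], a[swap] = a[swap], a[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # The division by the previous pivot is always exact.
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]


def rows_generate_Zn(matrix):
    """Corollary 7 for R = Z: the rows generate Z^n iff det is a unit, i.e. +1 or -1."""
    return int_det(matrix) in (1, -1)


print(rows_generate_Zn([[2, 1], [1, 1]]))   # det = 1, a unit in Z
print(rows_generate_Zn([[2, 0], [0, 1]]))   # det = 2, not a unit in Z
```

The second matrix shows why the unit condition matters: its rows are linearly independent over Z, but they generate only the vectors with even first component.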
We can derive other important facts from Corollary 7. 


COROLLARY 8. If R is commutative, then every generating set of
$R^n$ has at least n elements.


Proof. Suppose $u_1, \dots, u_m$ generate $R^n$ and m < n. Let A be the
n-by-n matrix such that A[i;] = $u_i$ for 1 ≤ i ≤ m and A[i;] is the zero vector
for m < i ≤ n. Then $S_R(A) = R^n$ and so, by Corollary 7, we know that
A is a unit in $M_n(R)$. However, A[n;] is the zero vector, so det A = 0, which
is not a unit in R. This contradicts Theorem 4.7.11. □


COROLLARY 9. If R is commutative, then every basis of $R^n$ has n
elements. If $R^m$ is isomorphic to $R^n$, then m = n.


Proof. First, we will show that $R^n$ has no infinite bases. Suppose Y is
an infinite basis for $R^n$. Let $x_1, \dots, x_n$ be the standard basis. Then Y
generates $R^n$ and each $x_i$ is an R-linear combination of a finite sequence of
elements of Y. Thus there is a finite subset $Y_0$ of Y such that $x_i$ is in $\langle Y_0 \rangle$
for each i. But this means that $\langle Y_0 \rangle = R^n$. If y is in $Y - Y_0$, then y can
be expressed as a linear combination of the elements in $Y_0$. Therefore
$Y_0 \cup \{y\}$ is a linearly dependent set and Y cannot be a basis for $R^n$.

Now suppose $y_1, \dots, y_m$ is a basis for $R^n$. By Corollary 4, $R^n \cong R^m$.
By symmetry, we may assume m ≤ n. If m < n, then $R^n$ is generated
by fewer than n elements, contradicting Corollary 8. □


The assumption of commutativity in Corollary 9 is essential. Exercise 3.11
demonstrates the existence of a noncommutative ring R such that
$R^n$ is isomorphic to R as a left R-module for all n ≥ 1.

If M is a finitely generated free module over a commutative ring R, then
M is isomorphic to $R^n$ for a unique value of n. We call n the rank of M. In
the case of a finite dimensional vector space V over a field F, we usually
speak of the dimension of V instead of the rank and denote it $\dim_F(V)$.

Let M be a free left module over a commutative ring R with basis
$x_1, \dots, x_n$. If y is in M, then there exist unique elements $r_1, \dots, r_n$ of
R such that $y = r_1x_1 + \dots + r_nx_n$. The vector $u = (r_1, \dots, r_n)$ in $R^n$ is
called the coordinate vector of y with respect to $x_1, \dots, x_n$. It is clear that
u is the image of y under the R-isomorphism of M onto $R^n$ taking $x_1, \dots, x_n$
to the standard basis of $R^n$. Now suppose $y_1, \dots, y_n$ is another basis for
M. Let P be the n-by-n matrix over R such that P[i;] is the coordinate vector
of $y_i$ with respect to $x_1, \dots, x_n$. That is,

$$y_i = \sum_j P_{ij} x_j, \qquad 1 \le i \le n.$$

We call P the transition matrix from the basis $x_1, \dots, x_n$ to the basis
$y_1, \dots, y_n$.


LEMMA 10. The matrix P is a unit in $M_n(R)$. Moreover, $P^{-1}$ is the
transition matrix from $y_1, \dots, y_n$ to $x_1, \dots, x_n$. Let z be in M and let
u and v be the coordinate vectors for z with respect to $x_1, \dots, x_n$ and
$y_1, \dots, y_n$, respectively. Then u = vP and $v = uP^{-1}$.


Proof. Let Q be the transition matrix from $y_1, \dots, y_n$ to $x_1, \dots, x_n$.
Then we have

$$y_i = \sum_j P_{ij} x_j, \qquad 1 \le i \le n,$$

$$x_i = \sum_j Q_{ij} y_j, \qquad 1 \le i \le n.$$

Therefore

$$x_i = \sum_j Q_{ij} \sum_k P_{jk} x_k = \sum_k \Big( \sum_j Q_{ij} P_{jk} \Big) x_k.$$

Since $x_i$ can be written as a linear combination of $x_1, \dots, x_n$ in only one
way, it follows that

$$\sum_j Q_{ij} P_{jk} = 0, \qquad i \ne k,$$

$$\sum_j Q_{ij} P_{ji} = 1.$$

Thus QP is I, the n-by-n identity matrix. Similarly, PQ = I, so Q and P are
inverses of each other in $M_n(R)$. If $z = v_1y_1 + \dots + v_ny_n$, then

$$z = \sum_i v_i y_i = \sum_i v_i \sum_j P_{ij} x_j = \sum_j \Big( \sum_i v_i P_{ij} \Big) x_j.$$

Therefore $z = u_1x_1 + \dots + u_nx_n$, where

$$u_j = \sum_i v_i P_{ij}.$$

Thus, if $u = (u_1, \dots, u_n)$ and $v = (v_1, \dots, v_n)$, then u = vP. Similarly,
$v = uQ = uP^{-1}$. □
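A small numerical check of Lemma 10 may help fix the row-vector convention. The following Python sketch takes M = $Z^2$ with $x_1, x_2$ the standard basis; the matrix P, its inverse, and the vector v are hypothetical values chosen for illustration, and vec_mat is a helper name introduced here.

```python
def vec_mat(v, m):
    """Row vector times matrix: the j-th entry is sum_i v[i] * m[i][j]."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


# Hypothetical transition matrix from x_1, x_2 to y_1, y_2 (det P = 1),
# so y_1 = 2 x_1 + x_2 and y_2 = x_1 + x_2.
P = [[2, 1], [1, 1]]
P_inv = [[1, -1], [-1, 2]]   # inverse of P over Z

v = [3, -1]                  # coordinates of z with respect to y_1, y_2
u = vec_mat(v, P)            # coordinates of z with respect to x_1, x_2

print(u)                     # z = 3 y_1 - y_2 written in terms of x_1, x_2
print(vec_mat(u, P_inv))     # recovers v, as in v = u P^{-1}
```

Because coordinate vectors are rows, the transition matrix acts on the right, which is why the lemma reads u = vP rather than u = Pv.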
The next lemma is a partial converse of Lemma 10. 


LEMMA 11. Let P be a unit in $M_n(R)$, let $x_1, \dots, x_n$ be a basis of M,
and set

$$y_i = \sum_j P_{ij} x_j, \qquad 1 \le i \le n.$$

Then $y_1, \dots, y_n$ is a basis for M.
Proof. Let $Q = P^{-1}$. Then, for 1 ≤ i ≤ n, we have

$$\sum_j Q_{ij} y_j = \sum_j Q_{ij} \sum_k P_{jk} x_k = \sum_k \Big( \sum_j Q_{ij} P_{jk} \Big) x_k = \sum_k (QP)_{ik} x_k = x_i,$$

since QP = I. Thus $\langle y_1, \dots, y_n \rangle$ contains each $x_i$, and so $y_1, \dots, y_n$
generate M.


We must now show that $y_1, \dots, y_n$ are linearly independent over
R. Suppose $r_1, \dots, r_n$ are in R and $r_1y_1 + \dots + r_ny_n = 0$. Then

$$0 = \sum_i r_i y_i = \sum_i r_i \sum_j P_{ij} x_j = \sum_j \Big( \sum_i r_i P_{ij} \Big) x_j.$$

Since $x_1, \dots, x_n$ are linearly independent over R, this means that

$$\sum_i r_i P_{ij} = 0, \qquad 1 \le j \le n.$$

Thus, if $u = (r_1, \dots, r_n)$, then uP = 0. But then u = u(PQ) = (uP)Q = 0Q = 0,
since PQ = I. Therefore $r_1 = \dots = r_n = 0$ and $y_1, \dots, y_n$ are linearly
independent. □


The following theorem combines Lemmas 10 and 11 with facts pre- 
viously obtained to form the main result of this section. 

THEOREM 12. Suppose R is a commutative ring and P is in $M_n(R)$.
Then the following are equivalent.

(a) P is a unit in $M_n(R)$.

(b) det P is a unit in R.

(c) The rows of P generate $R^n$.

(d) The rows of P are a basis for $R^n$.


Proof. We know that (a) and (b) are equivalent by Theorem 4.7.11.
Let $x_1, \dots, x_n$ be the standard basis of $R^n$. Then, using APL indexing for
P, we have

$$P[i;] = \sum_j P[i;j]\, x_j.$$

Lemmas 10 and 11 now show that (a) and (d) are equivalent. Finally,
Corollary 7 states that (a) and (c) are equivalent. □


Note that if P satisfies any one and hence all of the conditions of
Theorem 12, then P is the transition matrix from the standard basis of
$R^n$ to the basis consisting of the rows of P.
Theorem 12 has two important corollaries.


COROLLARY 13. If M is a free module of rank n over a commutative
ring R, then elements $x_1, \dots, x_n$ of M form a basis for M if and only
if $x_1, \dots, x_n$ generate M.


Proof. By the equivalence of conditions (c) and (d) of Theorem 12, the
corollary holds in $R^n$ and hence in any module isomorphic to $R^n$. (See
Exercise 10.)


COROLLARY 14. Suppose R is a commutative ring and M and N are
free R-modules of rank n. Then any surjective R-homomorphism from
M to N is injective.


Proof. Let $x_1, \dots, x_n$ be a basis of M and suppose f: M → N is a
surjective R-homomorphism. Set $y_i = x_if$, 1 ≤ i ≤ n. Then $x_1, \dots, x_n$ generate
M, and N, which is the image of M under f, is generated by $y_1, \dots, y_n$. By
Corollary 13, this means that $y_1, \dots, y_n$ form a basis of N. Now suppose
$u = r_1x_1 + \dots + r_nx_n$ is in the kernel of f. Then $0 = uf = r_1y_1 + \dots + r_ny_n$.
Since $y_1, \dots, y_n$ are linearly independent over R, this means that
$r_1 = \dots = r_n = 0$, and so u = 0. Thus the kernel of f is trivial and f is
injective. □


We need to point out that, under the assumptions of Corollary 14,
injective homomorphisms from M to N need not be surjective. For
example, the map f: Z → Z with xf = 2x is an injective Z-homomorphism that
is not surjective.


The final results of this section concern submodules of finitely gen- 
erated modules. 


THEOREM 15. Let R be a ring in which every left ideal is finitely
generated. Then, for all integers n ≥ 1, every submodule of $R^n$ is finitely
generated.


Proof. Let U be a submodule of $R^n$. For 1 ≤ i ≤ n, let $U_i$ be the subset
of U consisting of the elements whose first i − 1 components are 0. Each
$U_i$ is a submodule of U. Now let $S_i$ be the set of ith components of the
elements in $U_i$. If a and b are in $S_i$, then there exist elements u = (0, ..., 0,
a, ...) and v = (0, ..., 0, b, ...) in $U_i$. If r is in R, then ru = (0, ...,
0, ra, ...) and u + v = (0, ..., 0, a + b, ...) are in $U_i$, and so ra and a + b are
in $S_i$. Therefore $S_i$ is a left ideal of R. By assumption, $S_i$ is finitely generated.
Let $a_{ij}$, 1 ≤ j ≤ $n_i$, be a set of generators of $S_i$. For each pair (i, j), with
1 ≤ j ≤ $n_i$, select an element $u_{ij} = (0, \dots, 0, a_{ij}, \dots)$ in $U_i$ whose ith
component is $a_{ij}$. We will now show that U is generated by the $u_{ij}$.

Suppose w is in $U = U_1$. Let a be the first component of w. Then
a is in $S_1$, and so there exist $r_1, \dots, r_{n_1}$ such that
$a = r_1a_{11} + \dots + r_{n_1}a_{1n_1}$. The first component of
$r_1u_{11} + \dots + r_{n_1}u_{1n_1}$ is a, and so the first component of

$$w_2 = w - r_1u_{11} - \dots - r_{n_1}u_{1n_1}$$

is 0. That is, $w_2$ is in $U_2$. Now we consider the second component b of
$w_2$. Since b is in $S_2$, there are elements $s_1, \dots, s_{n_2}$ of R such that
$s_1a_{21} + \dots + s_{n_2}a_{2n_2} = b$. Therefore

$$w_3 = w_2 - s_1u_{21} - \dots - s_{n_2}u_{2n_2}$$

is in $U_3$. Proceeding in the same manner, we are able to continue subtracting
elements in the submodule V generated by the $u_{ij}$ from w until we get
the zero vector. Thus w is in V and V = U. Therefore U is finitely
generated. □


COROLLARY 16. If R is a PID, then every submodule of $R^n$ can be
generated by n elements.


Proof. If R is a PID, then in the proof of Theorem 15 we may take
each $n_i$ to be 1, so the submodule U of $R^n$ is generated by $u_{11}, \dots, u_{n1}$. □


It follows from Corollary 16 that every Z-submodule of $Z^3$ can be
generated by three elements and that every subspace of $\mathbf{R}^4$ over the real
numbers $\mathbf{R}$ can be generated by four elements.
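When R = Z, the construction in the proof of Theorem 15 can be carried out explicitly, because a generator of each ideal $S_i$ is a greatest common divisor and can be found with the Euclidean algorithm. The Python sketch below assumes, beyond what the proof requires, that the submodule is already given by a finite list of integer generators; the function name echelon_generators is hypothetical.

```python
def echelon_generators(rows, n):
    """Return at most n vectors generating the same Z-submodule of Z^n as rows.

    The i-th vector returned plays the role of u_{i1} in Corollary 16:
    it has zeros before its leading position.
    """
    work = [list(r) for r in rows]
    gens = []
    for col in range(n):
        # Euclidean algorithm on column `col`: subtract multiples of the row
        # with the smallest nonzero entry until one nonzero entry is left.
        while True:
            nonzero = [r for r in work if r[col] != 0]
            if len(nonzero) <= 1:
                break
            pivot = min(nonzero, key=lambda r: abs(r[col]))
            for r in nonzero:
                if r is not pivot:
                    q = r[col] // pivot[col]
                    for j in range(n):
                        r[j] -= q * pivot[j]
        if nonzero:
            gens.append(nonzero[0])
            work.remove(nonzero[0])   # remaining rows are zero in columns <= col
    return gens


print(echelon_generators([[4, 6], [6, 9]], 2))
print(echelon_generators([[2, 1, 0], [0, 3, 1], [4, 5, 1], [0, 0, 5]], 3))
```

Each step replaces a row by itself minus an integer multiple of another row, so the submodule generated never changes; only the number of nonzero rows shrinks.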


EXERCISES 


In the following exercises R is a nontrivial ring and M is a left R-module. 
Procedures such as ZDET and ZMATINV may be used in solving the com- 
putational exercises. 


1 Let X be a linearly independent subset of M. Prove that any sub- 
set of X is linearly independent. 


2 Prove that ∅ is a linearly independent subset of M.
3 Show that {0} is a linearly dependent subset of M.


4 Let Y be a linearly dependent subset of M. Show that any subset 
of M containing Y is linearly dependent. 


5 Let A be the matrix


Show that the rows of A are linearly independent over Z.


6 Let A be the matrix in Exercise 5. Show that u = (6, 0, −14, 18, 5)
is in $S_Z(A)$ and find the coordinate vector of u with respect to
the basis of $S_Z(A)$ consisting of the rows of A.


7 Let X be any set. Show that there is a free left R-module having
a basis that can be put in 1–1 correspondence with the elements
of X. (Hint. Consider certain functions from X to R.)


8 Let A be the matrix of Exercise 5 and let
4 -l1-l11 13 2 
B= {1-2 2 &8-I11 10 
6 3 -9 17 3 
Find a 3-by-3 integer matrix C such that B = CA.

9 Show that Q is not a free Z-module.


10 Suppose M is free with basis $x_1, \dots, x_n$ and assume f: M → N is
an R-isomorphism of left modules. Show that N is free with basis
$x_1f, \dots, x_nf$.

11 Explain the difference between the following two assertions.

(a) The elements $x_1, \dots, x_n$ form a basis of M.


(b) The module M is the internal direct sum of the submodules
$Rx_1, \dots, Rx_n$.


12 In each of the following show that the rows of P are a basis for
$R^n$.
(a) R=Z, n=2, : _ 
p= 
5 7 
(b) R=Z, n=3, l 0 —-]1 
p= [ 7 3 
0 2 l 
(c) R=Z,3, n=3, 3 l 7 
p= E 5 l 
4 0 3 
(4) R=Z,[X],n=3, l l x 
P= |X 1+X 1+X? 
1 14x 1] 


13 Let $P_n$ be the set of polynomials in R[X] of the form
$a_0 + a_1X + \dots + a_nX^n$. Show that $1, X, \dots, X^n$ is a basis of $P_n$ as an R-module.
Define $X^{(i)}$ to be $X(X - 1) \cdots (X - i + 1)$, the ith factorial
power of X. Show that $1, X^{(1)}, \dots, X^{(n)}$ is a basis for $P_n$. Take
n = 3 and compute the transition matrices from $1, X, X^2, X^3$ to
$1, X^{(1)}, X^{(2)}, X^{(3)}$ and from $1, X^{(1)}, X^{(2)}, X^{(3)}$ to
$1, X, X^2, X^3$.

14 Suppose P is the transition matrix from the basis $x_1, \dots, x_n$ of
M to the basis $y_1, \dots, y_n$ and Q is the transition matrix from
$y_1, \dots, y_n$ to the basis $z_1, \dots, z_n$. What is the transition matrix
from $x_1, \dots, x_n$ to $z_1, \dots, z_n$?

15 Let P be the matrix in Exercise 12b. Find the transition matrix
from the rows of P to the rows of the transpose P'.


16 Let U be the set of vectors (a, b, c) in $Z^3$ such that 2a − 3b +
5c = 0. Show that U is a Z-submodule of $Z^3$ and find a two-element
generating set for U.


3. ENDOMORPHISM RINGS 


Let R be a ring. In this section we will study the collection of all
R-homomorphisms from one R-module to another. Of particular interest will be the
structure of the set of all R-homomorphisms of a given R-module into itself.

Let M and N be left R-modules. The set of all R-homomorphisms of
M into N is denoted $\mathrm{Hom}_R(M, N)$. This set is always nonempty. (Why?)
We can define a binary operation + on $\mathrm{Hom}_R(M, N)$ by x(f + g) = xf + xg
for all f, g in $\mathrm{Hom}_R(M, N)$ and all x in M. Clearly, f + g is a map from M to
N, but we must check that f + g is an R-homomorphism. If x, y are in M and
r is in R, then

$$\begin{aligned}
(x + y)(f + g) &= (x + y)f + (x + y)g \\
&= xf + yf + xg + yg \\
&= xf + xg + yf + yg = x(f + g) + y(f + g),
\end{aligned}$$

and

$$\begin{aligned}
(rx)(f + g) &= (rx)f + (rx)g = r(xf) + r(xg) \\
&= r(xf + xg) = r[x(f + g)].
\end{aligned}$$

Therefore f + g is in fact an element of $\mathrm{Hom}_R(M, N)$. We can also try to
define a multiplication of elements in $\mathrm{Hom}_R(M, N)$ by elements of R using
the formula x(rf) = r(xf). However, if R is not commutative, then the
object rf is usually not in $\mathrm{Hom}_R(M, N)$. If x, y are in M, then

$$\begin{aligned}
(x + y)(rf) &= r[(x + y)f] = r(xf + yf) \\
&= r(xf) + r(yf) = x(rf) + y(rf).
\end{aligned}$$

Thus rf is a homomorphism of abelian groups. However, if s is in R, then

$$(sx)(rf) = r[(sx)f] = r[s(xf)] = (rs)(xf).$$

But if rf is to be an R-homomorphism, then (sx)(rf) should be

$$s[x(rf)] = s[r(xf)] = (sr)(xf).$$

Since (rs)(xf) and (sr)(xf) will not generally be equal unless R is
commutative, it follows that rf will normally be in $\mathrm{Hom}_R(M, N)$ only when
R is commutative.
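A concrete failure is easy to exhibit. In the Python sketch below, which is an illustration and not taken from the text, R is assumed to be the ring of 2-by-2 integer matrices, M = N = R regarded as a left R-module, and f is the identity map; the particular matrices r, s, and x are hypothetical choices.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


r = [[0, 1], [0, 0]]
s = [[0, 0], [1, 0]]
x = [[1, 0], [0, 1]]


def f(m):
    """The identity map on M, which is certainly an R-homomorphism."""
    return m


def rf(m):
    """The candidate map rf, defined by x(rf) = r(xf)."""
    return mat_mul(r, f(m))


lhs = rf(mat_mul(s, x))    # (sx)(rf) = (rs)(xf)
rhs = mat_mul(s, rf(x))    # s[x(rf)] = (sr)(xf)
print(lhs, rhs)            # the two sides differ, so rf is not an R-homomorphism
```

Here rs and sr are different matrix units, so rf respects addition but not the action of R.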


THEOREM 1. Suppose that R is a ring and that M and N are left
R-modules. The set $\mathrm{Hom}_R(M, N)$ is an abelian group under addition. If
R is commutative, then $\mathrm{Hom}_R(M, N)$ is an R-module.


Proof. The axioms for an abelian group are easily checked. The
additive identity element of $\mathrm{Hom}_R(M, N)$ is the map x ↦ 0, and the additive
inverse of an R-homomorphism f is the map −f defined by x(−f) = −(xf).
Suppose that R is commutative and that r, s are in R. Then, for any f, g in
$\mathrm{Hom}_R(M, N)$ and any x in M, we have

$$\begin{aligned}
x[r(f + g)] &= r[x(f + g)] = r(xf + xg) \\
&= r(xf) + r(xg) = x(rf) + x(rg) \\
&= x(rf + rg).
\end{aligned}$$

Therefore r(f + g) = rf + rg. Similarly, (r + s)f = rf + sf. Also,

$$\begin{aligned}
x[r(sf)] &= r[x(sf)] = r[s(xf)] \\
&= (rs)(xf) = x[(rs)f].
\end{aligned}$$

Hence r(sf) = (rs)f, and $\mathrm{Hom}_R(M, N)$ is an R-module. □


If M is a left R-module, then an R-homomorphism of M into itself
is called an R-endomorphism of M. The set of all R-endomorphisms of
M is $\mathrm{Hom}_R(M, M)$, which is usually denoted $\mathrm{End}_R(M)$. Besides the
operation of addition defined previously, there is another binary operation on
$\mathrm{End}_R(M)$, namely composition.


THEOREM 2. Let M be a left module over the ring R. Under the
operations of addition and composition, $\mathrm{End}_R(M)$ is a ring.


Proof. By Theorem 1, $\mathrm{End}_R(M)$ is an abelian group under addition.
Composition of functions on a set is always associative, and so $\mathrm{End}_R(M)$ is
a semigroup under multiplication. The identity function on M is an
R-endomorphism and, hence, $\mathrm{End}_R(M)$ has a multiplicative identity. All
that remains is to check the distributive laws. If f, g, and h are in $\mathrm{End}_R(M)$
and x is in M, then

$$\begin{aligned}
x[f(g + h)] &= (xf)(g + h) = (xf)g + (xf)h \\
&= x(fg) + x(fh) = x(fg + fh).
\end{aligned}$$

Therefore f(g + h) = fg + fh. Similarly, (f + g)h = fh + gh and thus $\mathrm{End}_R(M)$
is a ring. □


If R is commutative, then $\mathrm{End}_R(M)$ is an R-module by Theorem 1 and
a ring by Theorem 2. An R-algebra is a ring S that is also an R-module
such that for all r in R and all a, b in S we have r(ab) = (ra)b = a(rb).


THEOREM 3. If M is a left R-module over a commutative ring R, then
$\mathrm{End}_R(M)$ is an R-algebra.


Proof. Let f, g be in $\mathrm{End}_R(M)$ and let r be in R. Then, for any x in M,

$$\begin{aligned}
x[(rf)g] &= [x(rf)]g = [r(xf)]g \\
&= r[(xf)g] = r[x(fg)] \\
&= x[r(fg)].
\end{aligned}$$

Thus (rf)g = r(fg). Similarly, f(rg) = r(fg), and so $\mathrm{End}_R(M)$ is an
R-algebra. □


When R is a commutative ring, there are many examples of R-algebras
besides the endomorphism rings of R-modules. These include R[X], $M_n(R)$,
and R itself. The next section will discuss algebras in greater detail.

The main goal of this section is to provide a description of $\mathrm{Hom}_R(M, N)$
when M and N are finitely generated free modules. Any such module is
isomorphic to $R^m$ for some integer m, and so we begin with $\mathrm{Hom}_R(R^m, R^n)$.

LEMMA 4. Let A be an m-by-n matrix over R. The map taking u in
$R^m$ to uA is in $\mathrm{Hom}_R(R^m, R^n)$.

Proof. Suppose u and v are in $R^m$ and r is in R. Then uA = u+.×A is
in $R^n$, and so u ↦ uA is a map from $R^m$ to $R^n$. Moreover, (u + v)A =
uA + vA and (ru)A = r(uA). Therefore u ↦ uA is an R-homomorphism. □

LEMMA 5. If f: $R^m$ → $R^n$ is an R-homomorphism, then there is an
m-by-n matrix A over R such that uf = uA for all u in $R^m$.


Proof. Let $x_1, \dots, x_m$ be the standard basis of $R^m$ and let A be the
m-by-n matrix whose ith row is the vector $x_if$ in $R^n$. If $u = (r_1, \dots, r_m)$
is in $R^m$, then

$$uf = \Big( \sum_i r_i x_i \Big) f = \sum_i r_i (x_i f) = \sum_i r_i A[i;] = uA. \quad □$$


The map f: $Z^3$ → $Z^3$ given by (x, y, z)f = (2x − y, x + z, 3x + y − z)
is easily seen to be a Z-homomorphism. Since

(1, 0, 0)f = (2, 1, 3),
(0, 1, 0)f = (−1, 0, 1),
(0, 0, 1)f = (0, 1, −1),

(x, y, z)f is (x, y, z)+.×A, where A is the matrix


 2  1  3
−1  0  1
 0  1 −1

Lemmas 4 and 5 may be combined to give the following important 
result. 


THEOREM 6. If R is a ring, then $\mathrm{Hom}_R(R^m, R^n)$ is isomorphic as an
abelian group to the set M of m-by-n matrices over R. If R is commutative,
then $\mathrm{Hom}_R(R^m, R^n)$ and M are isomorphic as R-modules.

Proof. Let $x_1, \dots, x_m$ be the standard basis of $R^m$. For f in
$\mathrm{Hom}_R(R^m, R^n)$, let fσ be the m-by-n matrix A whose ith row is $x_if$. By
Lemma 5, uf = uA for all u in $R^m$. Suppose g is also in $\mathrm{Hom}_R(R^m, R^n)$
and B = gσ. Then

$$x_i(f + g) = x_if + x_ig = A[i;] + B[i;] = (A + B)[i;].$$

Therefore (f + g)σ = A + B = fσ + gσ, and σ is a homomorphism of abelian
groups. If fσ is the zero matrix, then f maps each of the $x_i$ to zero in $R^n$.
Thus f maps every linear combination of the $x_i$ to zero, so f is the zero
homomorphism. Therefore the kernel of σ is trivial and σ is injective. If
A is any m-by-n matrix over R, then by Lemma 4 the map f: u ↦ uA is in
$\mathrm{Hom}_R(R^m, R^n)$. Since $x_if = x_iA = A[i;]$, we have fσ = A and σ is
surjective. Thus σ is an isomorphism of abelian groups. If R is commutative,
then, for any r in R and any f in $\mathrm{Hom}_R(R^m, R^n)$, the map rf is in
$\mathrm{Hom}_R(R^m, R^n)$ and

$$x_i(rf) = r(x_if) = rA[i;] = (rA)[i;],$$

so (rf)σ = rA = r(fσ). Therefore σ is an R-homomorphism. □


COROLLARY 7. The rings $\mathrm{End}_R(R^n)$ and $M_n(R)$ are isomorphic. If
R is commutative, then $\mathrm{End}_R(R^n)$ and $M_n(R)$ are isomorphic as R-algebras.


Proof. Let σ be the map from $\mathrm{End}_R(R^n) = \mathrm{Hom}_R(R^n, R^n)$ to $M_n(R)$
defined in the proof of Theorem 6. Then σ is an isomorphism of abelian
groups and, if R is commutative, an isomorphism of R-modules. All that
remains is to show that (fg)σ = (fσ)+.×(gσ) for all f and g in $\mathrm{End}_R(R^n)$ and
that σ maps 1 in $\mathrm{End}_R(R^n)$ to 1 in $M_n(R)$. Let A = fσ and B = gσ. Then

$$\begin{aligned}
x_i(fg) &= (x_if)g = (x_iA)B \\
&= x_i(AB) = (AB)[i;].
\end{aligned}$$

Therefore (fg)σ = AB. Finally, if f is the identity endomorphism of $R^n$,
then $x_if = x_i = I[i;]$, where I is the n-by-n identity matrix. Thus fσ = I and
σ is a ring isomorphism. □
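The multiplicative part of Corollary 7 says that, with maps written on the right, the matrix of "f followed by g" is AB in that order. A brief Python check follows; the second endomorphism g and the helper names matrix_of and mat_mul are hypothetical choices made for this illustration.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def matrix_of(h, n):
    """Matrix whose rows are the images under h of the standard basis of Z^n."""
    basis = [tuple(int(i == j) for j in range(n)) for i in range(n)]
    return [list(h(e)) for e in basis]


def f(v):
    x, y, z = v
    return (2 * x - y, x + z, 3 * x + y - z)


def g(v):  # a hypothetical second endomorphism of Z^3
    x, y, z = v
    return (x + y, y, z - x)


def fg(v):
    """The product fg in End(Z^3): apply f first, then g."""
    return g(f(v))


A, B = matrix_of(f, 3), matrix_of(g, 3)
print(matrix_of(fg, 3) == mat_mul(A, B))   # matrix of fg is AB
print(mat_mul(A, B) == mat_mul(B, A))      # the order matters for these maps
```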


Suppose M and N are finitely generated free modules over R. Thus
$M \cong R^m$ and $N \cong R^n$ for some integers m and n. It is easy to show (see
Exercise 3) that $\mathrm{Hom}_R(M, N) \cong \mathrm{Hom}_R(R^m, R^n)$ and so, by Theorem 6,
the R-homomorphisms of M into N may be put in 1–1 correspondence
with m-by-n matrices over R. It is important to make this correspondence
explicit. Let $x_1, \dots, x_m$ be a basis of M and $y_1, \dots, y_n$ a basis of N.
Suppose f is in $\mathrm{Hom}_R(M, N)$. Then the equations

$$x_if = \sum_{j=1}^{n} A[i;j]\, y_j, \qquad 1 \le i \le m,$$

uniquely determine an m-by-n matrix A over R. The ith row of A is the
coordinate vector of $x_if$ with respect to $y_1, \dots, y_n$. We call A the matrix of f
with respect to the bases $x_1, \dots, x_m$ and $y_1, \dots, y_n$.


LEMMA 8. If u in $R^m$ is the coordinate vector of an element x in M
with respect to $x_1, \dots, x_m$, then uA is the coordinate vector of xf with
respect to $y_1, \dots, y_n$.


Proof. Suppose $u = (r_1, \dots, r_m)$. Then $x = r_1x_1 + \dots + r_mx_m$ and

$$\begin{aligned}
xf &= \Big( \sum_i r_i x_i \Big) f = \sum_i r_i (x_i f) \\
&= \sum_i r_i \sum_j A[i;j]\, y_j \\
&= \sum_j \Big( \sum_i r_i A[i;j] \Big) y_j,
\end{aligned}$$

and

$$\sum_i r_i A[i;j]$$

is the jth component of uA. □
If M = N so that f is in $\mathrm{End}_R(M)$, then it is standard practice to take
$y_1, \dots, y_n$ to be $x_1, \dots, x_m$ and to speak of A as the matrix of f with
respect to $x_1, \dots, x_m$.

The matrix of an element of $\mathrm{Hom}_R(M, N)$ depends on the choice of
bases, and we need to know how the matrix changes when we change the
bases in M and N. In Section 2 we made the assumption that R is
commutative when discussing change of basis in free modules, and we will make the
same assumption here.


THEOREM 9. Let R be a commutative ring and let M and N be
R-modules with bases $x_1, \dots, x_m$ and $y_1, \dots, y_n$, respectively. Suppose
P is the transition matrix from $x_1, \dots, x_m$ to another basis $x'_1, \dots, x'_m$ of
M and suppose Q is the transition matrix from $y_1, \dots, y_n$ to another basis
$y'_1, \dots, y'_n$ of N. If f is in $\mathrm{Hom}_R(M, N)$ and A is the matrix of f with respect
to $x_1, \dots, x_m$ and $y_1, \dots, y_n$, then $PAQ^{-1}$ is the matrix of f with respect
to $x'_1, \dots, x'_m$ and $y'_1, \dots, y'_n$.


Proof. Let x be in M and let u be the coordinate vector of x with
respect to $x'_1, \dots, x'_m$. By Lemma 2.10, the coordinate vector of x with
respect to $x_1, \dots, x_m$ is uP. By the definition of A, the coordinate vector
of xf with respect to $y_1, \dots, y_n$ is (uP)A. Again by Lemma 2.10, the
coordinate vector of xf with respect to $y'_1, \dots, y'_n$ is
$((uP)A)Q^{-1} = u(PAQ^{-1})$. Thus $PAQ^{-1}$ is the matrix of f with respect to
$x'_1, \dots, x'_m$ and $y'_1, \dots, y'_n$. □


COROLLARY 10. If A is the matrix of an element f in $\mathrm{End}_R(M)$
with respect to $x_1, \dots, x_m$, then $PAP^{-1}$ is the matrix of f with respect
to $x'_1, \dots, x'_m$.


Proof. In Theorem 9 we have Q = P. □


Let us consider an example of the use of Theorem 9. Let A be the 
matrix 


O<+A+3 401 2053 141214 4141 3 


and let f be the homomorphism from $Z^3$ to $Z^4$ taking U to U+.×A. For
example, if 


U+3 2 1 
then the image of U under fis 


_ _ Ut.xA 
7 3 3 16 


Then A is the matrix of f with respect to the standard bases of $Z^3$ and $Z^4$.
Now let


      □←P←3 3⍴6 5 2 10 5 3 3 2 1


 6 5 2
10 5 3
 3 2 1


The determinant of P is −1.


      ZDET P


¯1


Thus, by Theorem 2.12, the rows of P are a basis for $Z^3$ and P is the
transition matrix from the standard basis of $Z^3$ to this new basis. Similarly, if


Oo<@<4 uoG6 HO 1014141132014 14 14 «2121 
1 
0 
1 


then the calculation


      ZDET Q
1


shows that the rows of Q are a basis of $Z^4$. By Theorem 9, the matrix of
f with respect to the rows of P and the rows of Q is


      □←B←P+.×A+.×ZMATINV Q
u5  §23 74 35 
75 4O 124 53 
22 11 37 16 


As a check, let


      U←3 ¯4 6
      □←V←U+.×P
¯4 7 0


Then U is the coordinate vector for V with respect to the rows of P. Now
the image of V under f is


      □←W←V+.×A
17 114 ~13 




and, according to Theorem 9, the coordinate vector of W with respect to 
the rows of Q should be 


      □←X←U+.×B
25 25 52 11 


This means that W should equal X+.×Q


      X+.×Q
17114 7°13 


which, indeed, it does. 
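The same check can be repeated outside APL. In the Python sketch below, P and U are taken from the session above, while the matrices A and Q are hypothetical stand-ins chosen for illustration (they are not the A and Q of the text), and the inverse is computed over the rationals by Gauss-Jordan elimination rather than with ZMATINV.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def inverse(m):
    """Inverse of an invertible square matrix over Q (Gauss-Jordan elimination).

    Raises StopIteration if no pivot can be found, i.e. if m is singular.
    """
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


P = [[6, 5, 2], [10, 5, 3], [3, 2, 1]]                          # from the text, det = -1
A = [[1, 2, 0, 1], [0, 1, 3, 0], [2, 0, 1, 1]]                  # hypothetical 3-by-4 matrix
Q = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]    # hypothetical, det = 1

B = mat_mul(mat_mul(P, A), inverse(Q))   # matrix of f in the new bases (Theorem 9)

U = [[3, -4, 6]]                         # coordinates with respect to the rows of P
V = mat_mul(U, P)                        # the same vector in standard coordinates
W = mat_mul(V, A)                        # its image under f, standard coordinates
X = mat_mul(U, B)                        # its image, coordinates w.r.t. the rows of Q
print(V, W, mat_mul(X, Q) == W)
```

Since Q has determinant 1, its inverse has integer entries even though the computation passes through rational arithmetic.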


EXERCISES 


1 Fill in the details in the proof of Theorem 1. 

2 What are the orders of the following abelian groups? 
(a) Homz(Z,, Zs). 

(b) Homz(Z,, Z,4). 
(c) Homz(Z, X Z,, Z4 X Zq). 
(d) Homz(Z¢, Zs). 

3 Suppose $M_1, M_2, N_1, N_2$ are R-modules with $M_1 \cong M_2$ and
$N_1 \cong N_2$. Show that $\mathrm{Hom}_R(M_1, N_1)$ and $\mathrm{Hom}_R(M_2, N_2)$ are isomorphic
as abelian groups and, if R is commutative, as R-modules. Prove
also that $\mathrm{End}_R(M_1)$ and $\mathrm{End}_R(M_2)$ are isomorphic rings.

4 Let A be an abelian group. Show that the group of units in the ring
$\mathrm{End}_Z(A)$ is Aut(A).

5 Let R be a commutative ring. Prove that R, R[X], and $M_n(R)$ are
all R-algebras.

6 Let M be an R-module that is nontrivial and simple in the sense that
the only submodules of M are {0} and M. Show that $\mathrm{End}_R(M)$ is
a division ring.

7 Let A be an m-by-n matrix over R and let f be the element of
$\mathrm{Hom}_R(R^m, R^n)$ taking u to uA. Show that the image of f is $S_R(A)$.

8 Let P be the matrix 


2 4 3 
5 l 
3 0 2 


in M,;(Z,,). Show that the rows of P are a basis for M = (Z,, )°. Let 
f:M—M be the map taking (x, y,z) to (VV +z, 0,x — z). Show that 


f is in $\mathrm{End}_Z(M)$. What is the matrix of f with respect to the standard
basis of M? What is the matrix of f with respect to the rows of P?


9 Let f: $Q^3$ → $Q^4$ take (x, y, z) to
3 ree 
(y — 7 20x + 4y,5¥ +32,x — Zz). 

Show that f is in $\mathrm{Hom}_Q(Q^3, Q^4)$ and determine the matrix of f
with respect to the standard bases of $Q^3$ and $Q^4$.

10 Let R be a commutative ring, $\mathbf{N}$ the set of positive integers, and
M the set of functions f: $\mathbf{N}$ → R such that $\{n \in \mathbf{N} \mid f(n) \ne 0\}$ is
finite. Show that M is an R-module. For each i in $\mathbf{N}$, let $f_i$ be the
element in M taking i to 1 and j to 0 for all j ≠ i. Show that
$\{f_i \mid i \in \mathbf{N}\}$ is a basis for M. Prove that N = M ⊕ M is isomorphic to
M as an R-module.


11 Let R, M, and N be as in Exercise 10 and let θ: N → M be a fixed
R-isomorphism. Define maps α, β, γ, δ from N to N as follows.
For u = (x, y) in N,

uα = (0, uθ),

uβ = (uθ, 0),

uγ = $y\theta^{-1}$,

uδ = $x\theta^{-1}$.


Prove that α, β, γ, δ are in S = $\mathrm{End}_R(N)$ and that in $M_2(S)$

$$\begin{pmatrix} \alpha & 0 \\ \beta & 0 \end{pmatrix} \begin{pmatrix} \gamma & \delta \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Show also that (γ, δ) is a basis for S ⊕ S. Conclude that S ⊕ S
is isomorphic to S as a left S-module and hence that $S^n$ is
isomorphic to S for all n ≥ 1.


4. ALGEBRAS 


Throughout this section R will be a commutative ring. In Section 3 we
defined an R-algebra to be a ring S that is also an R-module and in which
r(ab) = (ra)b = a(rb) for all r in R and all a and b in S. Among the examples
of R-algebras we have seen are R, R[X], $M_n(R)$, and $\mathrm{End}_R(M)$, where
M is any R-module. We will construct additional examples in this section.
Let S be an R-algebra and suppose I is an ideal of S. Then I is also
an R-submodule of S. For, if r is in R and a is in I, then ra = (r1)a is in I
because I is an ideal. It follows that S/I is a ring because I is an ideal of S and
S/I is an R-module because I is a submodule of S. Moreover, for all r in R
and all a and b in S, we have

$$\begin{aligned}
r[(I + a)(I + b)] &= r[I + ab] = I + r(ab) \\
&= [r(I + a)](I + b) \\
&= (I + a)[r(I + b)],
\end{aligned}$$

and so S/I is an R-algebra, the quotient algebra of S modulo I. The natural
map from S to S/I is an algebra homomorphism, that is, it is compatible
with both the ring and the module structures.

As an example of a quotient algebra, we may take S = R[X] and
$I = \langle f \rangle$, the set of all multiples in S of some polynomial f. Then I is an
ideal and S/I is an R-algebra.


THEOREM 1. Suppose f in R[X] has positive degree n and a leading
coefficient that is a unit in R. Then, as an R-module, $R[X]/\langle f \rangle$ is free
of rank n.


Proof. Let U be the subset of R[X] consisting of the polynomials of
degree at most n − 1, including the zero polynomial. Then U is an
R-submodule of R[X] that is free with basis $1, X, \dots, X^{n-1}$. We will show that
under the natural map from R[X] to $R[X]/\langle f \rangle$ the submodule U is
mapped isomorphically onto $R[X]/\langle f \rangle$. Suppose g is in R[X]. Since the
leading coefficient of f is a unit in R, we can perform a division of g by f and
obtain polynomials q and r such that g = qf + r and r is in U. Then
$\langle f \rangle + g = \langle f \rangle + r$, and U is mapped onto $R[X]/\langle f \rangle$. Now suppose some nonzero
element h of U is mapped to 0 in $R[X]/\langle f \rangle$. Then h is in $\langle f \rangle$ and is
divisible by f. Therefore h = fu for some nonzero u in R[X]. But, again,
because the leading coefficient of f is a unit in R,

$$n - 1 \ge \deg(h) = \deg(fu) = \deg(f) + \deg(u) = n + \deg(u) \ge n,$$

which is a contradiction. Thus $R[X]/\langle f \rangle$ is isomorphic as an R-module
to U and so is free of rank n. □
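The division step in this proof is the only place where the unit leading coefficient is used, and it can be made concrete for R = Z, where the units are 1 and −1. The Python sketch below is an illustration under stated assumptions: polynomials are stored as coefficient lists in ascending order (constant term first, as in the APL sessions), poly_divmod is a hypothetical name, and the divisor $3 - X + 2X^2 - X^3$ is used as the sample f.

```python
def poly_divmod(g, f):
    """Divide g by f in Z[X], where the leading coefficient of f is 1 or -1.

    Returns (q, r) with g = q*f + r and deg r < deg f; r is padded to length deg f.
    """
    lead = f[-1]
    if lead not in (1, -1):
        raise ValueError("leading coefficient of f must be a unit in Z")
    n = len(f) - 1
    r = list(g) + [0] * max(0, n - len(g))
    q = [0] * max(len(g) - n, 1)
    for k in range(len(r) - 1, n - 1, -1):
        c = r[k] * lead              # lead is its own inverse, since lead = 1 or -1
        q[k - n] = c
        for j, fj in enumerate(f):
            r[k - n + j] -= c * fj   # cancels the coefficient of X^k
    return q, r[:n]


f = [3, -1, 2, -1]                   # 3 - X + 2X^2 - X^3
q, r = poly_divmod([0, 0, 0, 0, 1], f)
print(q, r)                          # quotient and remainder of X^4 divided by f
```

The remainder always has exactly n coefficients, which is the coordinate vector of the class of g with respect to the basis $1, X, \dots, X^{n-1}$ of U.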


Suppose f is the polynomial $3 - X + 2X^2 - X^3$ in Z[X]. Then, by
Theorem 1, the algebra $Z[X]/\langle f \rangle$ is a free Z-module with basis $\langle f \rangle + 1$,
$\langle f \rangle + X$, $\langle f \rangle + X^2$.

Algebras of the type discussed in Theorem 1 are of interest when we 
are trying to factor polynomials. The reason is given by the following the- 
orem and its corollary. 


THEOREM 2. Let S be a commutative ring and let a be a nonzero
nonunit in S. Then a is prime if and only if $S/\langle a \rangle$ is an integral domain.


Proof. Recall that a is prime if, whenever a divides a product, then it
divides one of the factors. For s in S let $\bar{s}$ denote the image of s in $S/\langle a \rangle$.
First, suppose that $S/\langle a \rangle$ is an integral domain and that a divides bc,
where b and c are in S. Then, in $S/\langle a \rangle$, we have $0 = \overline{bc} = \bar{b}\bar{c}$. Since $S/\langle a \rangle$
is an integral domain, either $\bar{b} = 0$ or $\bar{c} = 0$. That is, either b is in $\langle a \rangle$
or c is in $\langle a \rangle$. Therefore a is prime. Now suppose a is prime. Since a is
not a unit, $\langle a \rangle \ne S$ and 1 ≠ 0 in $S/\langle a \rangle$. If $\bar{b}\bar{c} = 0$ in $S/\langle a \rangle$, then
a divides bc, so a divides either b or c. Therefore $\bar{b} = 0$ or $\bar{c} = 0$ and $S/\langle a \rangle$
is an integral domain. □


COROLLARY 3. Let R be a UFD and let f be a polynomial of posi- 
tive degree in R[X]. Then f is irreducible in R[X] if and only if R[X]/<f> 
is an integral domain. 


Proof. Since f has positive degree, f is a nonzero nonunit in R[X].
By Theorem 4.11.1, R[X] is a UFD and by Theorems 4.10.4 and 4.10.6
the concepts of primality and irreducibility are the same in UFD's. Thus
the corollary follows immediately from Theorem 2. □


Corollary 3 can be rephrased as follows. Suppose we can find two
polynomials g and h in R[X] such that f divides gh but f does not divide
either g or h. Then either gcd(f, g) or gcd(f, h) is a nontrivial factor of f.
If R is a field, we can strengthen Corollary 3. 


COROLLARY 4. Let F be a field and let f in F[X] have positive
degree. Then f is irreducible if and only if $F[X]/\langle f \rangle$ is a field.


Proof. If $S = F[X]/\langle f \rangle$ is a field, then S is certainly an integral
domain and f is irreducible by Corollary 3. Now suppose f is irreducible and
let g be a polynomial in F[X] representing a nonzero element of S. We may
assume that deg(g) < deg(f). Let h be the monic gcd of f and g. Then
deg(h) ≤ deg(g) and h divides f. Therefore, since f is irreducible, h must be 1. Thus,
since F[X] is a PID, there exist u and v in F[X] such that 1 = uf + vg.
The element of S represented by v is an inverse for the element represented
by g. Hence S is a field. □


The procedures ZNXIRRED or ZNXFACTOR may be used to find
irreducible elements in $Z_p[X]$, p a prime. Corollary 4 then shows how to
construct finite fields that do not have prime cardinality. For example, the
calculation

      N←3
      ZNXIRRED 2


DMMP MRO 
MWRPORRRB 
PREROOO 


determines the monic irreducible polynomials in $Z_3[X]$ of degree at most
2. The first quadratic one is $f = 1 + X^2$. By Corollary 4, $S = Z_3[X]/\langle f \rangle$ is
a field and, by Theorem 1, we know that S is a free $Z_3$-module of rank 2.
Therefore $|S| = |Z_3|^2 = 9$.
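For such small cases the list of irreducibles can be reproduced by brute force, since a polynomial of degree 2 over a field is irreducible exactly when it has no root. The Python sketch below relies on that fact and is not a description of how ZNXIRRED works; the function name and the ascending coefficient-list representation are assumptions made for the illustration.

```python
from itertools import product


def monic_irreducibles_up_to_degree_2(p):
    """Monic irreducible polynomials of degree 1 or 2 over Z_p, p prime.

    Polynomials are coefficient lists in ascending order, e.g. [1, 0, 1] is 1 + X^2.
    """
    result = [[c, 1] for c in range(p)]          # every monic linear polynomial
    for c0, c1 in product(range(p), repeat=2):   # candidates X^2 + c1*X + c0
        has_root = any((x * x + c1 * x + c0) % p == 0 for x in range(p))
        if not has_root:                         # no root means irreducible in degree 2
            result.append([c0, c1, 1])
    return result


for poly in monic_irreducibles_up_to_degree_2(3):
    print(poly)
```

With p = 3 the first quadratic polynomial produced is [1, 0, 1], that is, $1 + X^2$, in agreement with the text. The root test is not sufficient in degree 4 or higher, so this shortcut does not generalize.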

CLASSLIB contains procedures for working in algebras $R[X]/\langle f \rangle$,
where f is monic and the coefficient ring R is one of Z, $Z_n$, or the real
numbers. We will illustrate these procedures with an example in which R is
$Z_5$ and f is $2 + 4X + X^2 + X^3$.

      N←5
      F←2 4 1 1


We can represent elements of $S = Z_5[X]/\langle f \rangle$ by polynomials, remembering
that two polynomials g and h represent the same element of S if and only
if f divides g − h or, equivalently, if and only if g and h give the same
remainder when divided by f. The computations


      G←2 1 3
      H←3 4 3 1 3
      F ZNXREM G ZNXDIFF H


      F ZNXREM G


      F ZNXREM H


verify in two different ways that $2 + X + 3X^2$ and $3 + 4X + 3X^2 + X^3 + 3X^4$
represent the same element of S. We can use ZNXSUM, ZNXDIFF,
and ZNXPROD to perform the ring operations in S and reduce as necessary
modulo f using ZNXREM. For example, the calculation


      A←4 2 3
      F ZNXREM 2 ZNXSUM (2×A) ZNXSUM A ZNXPROD A
0


shows that the element a of S represented by $4 + 2X + 3X^2$ satisfies
$2 + 2a + a^2 = 0$. The element of S represented by a polynomial g is a unit in
S if and only if gcd(f, g) = 1. We can use ZNXGCD to try to compute
inverses in S.


      F ZNXGCD A


      S


      F ZNXREM A ZNXPROD S


Here we see that a is a unit and $a^{-1}$ is given by the global variable S
computed by ZNXGCD, which is $2 + 4X + X^2$.
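The claims made in this example can be confirmed independently of CLASSLIB. The Python sketch below uses hypothetical helper names (poly_add, poly_mul, poly_rem) that only loosely mirror ZNXSUM, ZNXPROD, and ZNXREM; it does not compute a gcd, and simply verifies the inverse stated in the text. Polynomials are coefficient lists in ascending order.

```python
N = 5
F = [2, 4, 1, 1]          # f = 2 + 4X + X^2 + X^3, monic


def poly_add(a, b, n=N):
    """Sum of two polynomials with coefficients reduced modulo n."""
    size = max(len(a), len(b))
    a, b = a + [0] * (size - len(a)), b + [0] * (size - len(b))
    return [(x + y) % n for x, y in zip(a, b)]


def poly_mul(a, b, n=N):
    """Product of two polynomials with coefficients reduced modulo n."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % n
    return out


def poly_rem(f, g, n=N):
    """Remainder of g on division by the monic polynomial f, modulo n."""
    d = len(f) - 1
    r = [c % n for c in g]
    for k in range(len(r) - 1, d - 1, -1):
        c = r[k]
        for j, fj in enumerate(f):
            r[k - d + j] = (r[k - d + j] - c * fj) % n
    return (r[:d] + [0] * d)[:d]   # pad short inputs to length deg f


G, H = [2, 1, 3], [3, 4, 3, 1, 3]
a, s = [4, 2, 3], [2, 4, 1]

print(poly_rem(F, G), poly_rem(F, H))                       # same remainder
expr = poly_add([2], poly_add(poly_mul([2], a), poly_mul(a, a)))
print(poly_rem(F, expr))                                    # 2 + 2a + a^2 in S
print(poly_rem(F, poly_mul(a, s)))                          # a times the claimed inverse
```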

The process of reducing modulo f using ZNXREM is somewhat time
consuming. If many calculations in S are to be performed, there is a more
efficient way to reduce modulo f. Since f is cubic, every element of S can
be represented by a polynomial of degree at most 2. The product of two
polynomials of degree 2 has degree 4 and, if we always reduce modulo f
after every multiplication, we need never work with polynomials of degree
greater than 4. We start by making a matrix R whose rows give the
remainders modulo f of 1, X, X^2, X^3, and X^4. One way to do this is


      ⎕←P←(⍳5)∘.=⍳5              ⎕←R←F ZNXREM P
1 0 0 0 0                   1 0 0
0 1 0 0 0                   0 1 0
0 0 1 0 0                   0 0 1
0 0 0 1 0                   3 1 4
0 0 0 0 1                   2 2 2


Now, if G is a vector of length 5 representing a polynomial in Z_5[X] of
degree at most 4, then 5|G+.×R is the same as F ZNXREM G, and the
former takes less time to compute.


      G←1 2 3 2 1                 F ZNXREM G
      5|G+.×R               4 1 3
4 1 3


Note that the first three rows of R are the identity matrix and need not be
computed. If we replace R by


      ⎕←R0←3 0↓R
3 1 4
2 2 2

then F ZNXREM G can be computed as follows.


      5|(3↑G)+(3↓G)+.×R0
4 1 3
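
The remainder-table idea can be sketched in Python as follows. This is an illustrative reconstruction, not the CLASSLIB code: the function names are hypothetical, f is assumed monic, and each row of the table is obtained from the previous one by multiplying by X and folding the overflow back in.

```python
P = 5
F = [2, 4, 1, 1]                     # f = 2 + 4X + X^2 + X^3

def remainder_table(f, p, max_deg):
    """Rows are X^d mod f for d = deg f .. max_deg (f monic, constants first)."""
    n = len(f) - 1
    row = [(-c) % p for c in f[:n]]  # X^n = -(lower-order part of f)
    table = [row]
    for _ in range(n + 1, max_deg + 1):
        top = row[-1]                # coefficient that would become X^n
        shifted = [0] + row[:-1]     # multiply the previous row by X
        row = [(s + top * t) % p for s, t in zip(shifted, table[0])]
        table.append(row)
    return table

def reduce_with_table(g, table, p):
    """Remainder of g modulo f, valid for len(g) <= n + len(table)."""
    n = len(table[0])
    g = list(g) + [0] * (n - len(g))
    low, high = g[:n], g[n:]
    return [(low[k] + sum(h * r[k] for h, r in zip(high, table))) % p
            for k in range(n)]

R0 = remainder_table(F, P, 4)
REDUCED = reduce_with_table([1, 2, 3, 2, 1], R0, P)
```

Here R0 plays the role of the two-row matrix R0 in the text, and REDUCED corresponds to 5|(3↑G)+(3↓G)+.×R0.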


In CLASSLIB the procedures with the prefix ZNXF may be used to
compute in Z_n[X]/<f>, where f is monic. As usual, the value of n must
be assigned to N. In addition, the procedure ZNXFINIT must be used to
initialize a matrix ZNXRT that is used like R0 above to reduce modulo
f. (The letters RT stand for remainder table.)




      ZNXFINIT F
      ZNXRT
3 1 4
2 2 2

Provided A and B represent arrays of polynomials in Z_n[X] of degree less
than the degree of f, then A ZNXFPROD B produces the same result as
F ZNXREM A ZNXPROD B.


      A←1 2 3

      B←2 1 4

      F ZNXREM A ZNXPROD B
mt 

A ZNXFPROD B 
mt 


The procedures ZNXFSUM, ZNXFDIFF, ZNXFPOWER, and ZNXFINV
may also be used for computations in Z_n[X]/<f>. Procedures with the
prefixes ZXF and RXF perform computations in Z[X]/<f> and
R[X]/<f>, respectively, where f is a monic polynomial.

In Section 4.4 we defined, for any commutative ring R, evaluation
maps of R[X] into R. These are the maps obtained by fixing an element
a of R and mapping a polynomial f to its value f(a). We can extend this
concept to maps from R[X] into any R-algebra. Suppose S is an R-algebra
and a is an element of S. If f = c_0 + c_1 X + ... + c_n X^n is in R[X], then
c_0 + c_1 a + ... + c_n a^n is a well-defined element of S, which we will call the
value of f at a and denote f(a). It is easy to show that the map f ↦ f(a)
is an R-algebra homomorphism. As an example, let us take R = Z, S = M_2(Z),

\[ A = \begin{pmatrix} 2 & -1 \\ 1 & 3 \end{pmatrix}, \]

and f = 4 - 2X + X^2. Then

\[ f(A) = 4I - 2A + A^2
        = 4\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
        - 2\begin{pmatrix} 2 & -1 \\ 1 & 3 \end{pmatrix}
        + \begin{pmatrix} 3 & -5 \\ 5 & 8 \end{pmatrix}
        = \begin{pmatrix} 3 & -3 \\ 3 & 6 \end{pmatrix}. \]
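
The evaluation map into a matrix algebra can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, integer matrices are nested lists, and the polynomial is evaluated by Horner's rule, with each coefficient added as a multiple of the identity.

```python
def mat_mul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def eval_poly_at_matrix(coeffs, m):
    """Horner evaluation of c0 + c1 X + ... + cn X^n at the square matrix m."""
    n = len(m)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, m)
        for i in range(n):
            result[i][i] += c          # add c times the identity
    return result

A = [[2, -1], [1, 3]]
VALUE = eval_poly_at_matrix([4, -2, 1], A)   # f = 4 - 2X + X^2
```

For the matrix A of the example, VALUE is the matrix with rows (3, -3) and (3, 6).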


We can define a map from R to any R-algebra S by mapping r in R to
r1, where 1 is the multiplicative identity of S. This map is also an algebra
homomorphism. If it is an embedding, that is, an injective map, then we
often identify r and r1 and consider R to be a subalgebra of S.

The remainder of this section will be devoted to another method
for constructing R-algebras that are finitely generated free R-modules.


THEOREM 5. Let S be an R-algebra with an R-basis x_1, ..., x_n.
Then the multiplication in S is determined by the module structure and the
n^2 products x_i x_j, 1 ≤ i, j ≤ n.


Proof. If a and b are in S, we can find r_1, ..., r_n and s_1, ..., s_n in
R such that a = r_1 x_1 + ... + r_n x_n and b = s_1 x_1 + ... + s_n x_n. Then the axioms
for an R-algebra imply that

\[ ab = \Bigl(\sum_i r_i x_i\Bigr)\Bigl(\sum_j s_j x_j\Bigr)
      = \sum_{i,j} (r_i x_i)(s_j x_j) = \sum_{i,j} r_i s_j (x_i x_j). \]

Thus, if we know the structure of S as an R-module and are given the
products x_i x_j, we can compute ab. □


The most natural way to describe the products x_i x_j of Theorem 5
is to express them as linear combinations of the basis elements, that is, by
giving elements e_ijk of R such that

\[ x_i x_j = \sum_k e_{ijk} x_k. \]

If this is done, then the formula for multiplication in S is

\[ \Bigl(\sum_i r_i x_i\Bigr)\Bigl(\sum_j s_j x_j\Bigr)
   = \sum_{i,j} r_i s_j (x_i x_j)
   = \sum_{i,j} r_i s_j \sum_k e_{ijk} x_k
   = \sum_k \Bigl(\sum_{i,j} r_i s_j e_{ijk}\Bigr) x_k. \tag{*} \]

We cannot choose the e_ijk arbitrarily and get an R-algebra.


THEOREM 6. Let S be a free R-module with a basis x_1, ..., x_n and
for 1 ≤ i, j, k ≤ n, let e_ijk be in R. Then formula (*) defines a multiplication
on S that satisfies both of the distributive laws and the property r(uv) =
(ru)v = u(rv) for all r in R and all u, v in S. Moreover, the following are
equivalent.

(a) The multiplication is associative.
(b) x_i(x_j x_k) = (x_i x_j)x_k for all i, j, k.
(c) For all i, j, k, m

\[ \sum_\ell e_{jk\ell}\, e_{i\ell m} = \sum_\ell e_{ij\ell}\, e_{\ell km}. \]


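Proof. Let u = r_1 x_1 + ... + r_n x_n, v = s_1 x_1 + ... + s_n x_n, and w =
t_1 x_1 + ... + t_n x_n, with all r_i, s_i, and t_i in R. Then

\[ u(v + w) = \Bigl(\sum_i r_i x_i\Bigr)\Bigl(\sum_j (s_j + t_j) x_j\Bigr)
            = \sum_{i,j} r_i (s_j + t_j)(x_i x_j)
            = \sum_{i,j} r_i s_j (x_i x_j) + \sum_{i,j} r_i t_j (x_i x_j)
            = uv + uw. \]

Similarly, we can verify by direct calculation that (u + v)w = uw + vw and
r(uv) = (ru)v = u(rv) for all r in R. Clearly, condition (a) implies condition
(b). To see that (b) implies (a), we compute

\[ u(vw) = \Bigl(\sum_i r_i x_i\Bigr)\Bigl[\Bigl(\sum_j s_j x_j\Bigr)\Bigl(\sum_k t_k x_k\Bigr)\Bigr]
         = \Bigl(\sum_i r_i x_i\Bigr)\Bigl(\sum_{j,k} s_j t_k (x_j x_k)\Bigr)
         = \sum_{i,j,k} r_i s_j t_k \,[x_i(x_j x_k)] \]

and

\[ (uv)w = \Bigl[\Bigl(\sum_i r_i x_i\Bigr)\Bigl(\sum_j s_j x_j\Bigr)\Bigr]\Bigl(\sum_k t_k x_k\Bigr)
         = \Bigl(\sum_{i,j} r_i s_j (x_i x_j)\Bigr)\Bigl(\sum_k t_k x_k\Bigr)
         = \sum_{i,j,k} r_i s_j t_k \,[(x_i x_j) x_k]. \]

Thus, if x_i(x_j x_k) = (x_i x_j)x_k for all i, j, k, then we have u(vw) = (uv)w and
the associative law holds. Now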


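\[ x_i(x_j x_k) = x_i \sum_\ell e_{jk\ell} x_\ell
               = \sum_\ell e_{jk\ell}\, x_i x_\ell
               = \sum_\ell e_{jk\ell} \sum_m e_{i\ell m} x_m
               = \sum_m \Bigl(\sum_\ell e_{jk\ell}\, e_{i\ell m}\Bigr) x_m \]

and

\[ (x_i x_j)x_k = \Bigl(\sum_\ell e_{ij\ell} x_\ell\Bigr) x_k
               = \sum_\ell e_{ij\ell}\, x_\ell x_k
               = \sum_\ell e_{ij\ell} \sum_m e_{\ell km} x_m
               = \sum_m \Bigl(\sum_\ell e_{ij\ell}\, e_{\ell km}\Bigr) x_m. \]

Since x_1, ..., x_n are a basis for S, we see that x_i(x_j x_k) = (x_i x_j)x_k if and
only if

\[ \sum_\ell e_{jk\ell}\, e_{i\ell m} = \sum_\ell e_{ij\ell}\, e_{\ell km} \]

for all m. Thus (b) and (c) are equivalent. □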


The elements e_ijk are called the structure constants for the
multiplication defined by (*).

For any commutative ring R there is an important R-algebra called the
quaternion algebra over R. This is the algebra Q with an R-basis 1, i, j, k
whose elements multiply according to the following table.

           1     i     j     k
     1     1     i     j     k
     i     i    -1     k    -j
     j     j    -k    -1     i
     k     k     j    -i    -1


This table can be remembered easily if one notes that 1 is the
multiplicative identity, i^2 = j^2 = k^2 = -1 and, if (a, b, c) is any cyclic permutation
of (i, j, k), then ab = c and ba = -c. It is standard practice to identify an
element r in R with r1 in Q, thereby making R a subring of Q.

To verify, using Theorem 6, that the quaternion algebra Q is really an
R-algebra we must show that multiplication is associative for triples of
basis elements. There are 4^3 = 64 such triples, and checking each one by
hand is somewhat tedious. The exercises describe alternative approaches
to verifying associativity.
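
One mechanical way to carry out the check is to build the structure constants of Q from the multiplication rules and test condition (c) of Theorem 6 for every choice of indices. The Python sketch below is an illustration with hypothetical names. It indexes the basis 1, i, j, k as 0 to 3 and stores each basis product as a (sign, index) pair.

```python
from itertools import product

# Products of the basis elements 1, i, j, k (indices 0..3) as (sign, index).
TABLE = [
    [(1, 0), (1, 1), (1, 2), (1, 3)],      # 1*1, 1*i, 1*j, 1*k
    [(1, 1), (-1, 0), (1, 3), (-1, 2)],    # i*1, i*i, i*j, i*k
    [(1, 2), (-1, 3), (-1, 0), (1, 1)],    # j*1, j*i, j*j, j*k
    [(1, 3), (1, 2), (-1, 1), (-1, 0)],    # k*1, k*i, k*j, k*k
]

# e[i][j][k] is the coefficient of basis element k in (basis i)(basis j).
e = [[[0] * 4 for _ in range(4)] for _ in range(4)]
for i, j in product(range(4), repeat=2):
    sign, k = TABLE[i][j]
    e[i][j][k] = sign

def satisfies_condition_c(e):
    """Condition (c) of Theorem 6 for an n-by-n-by-n array of constants."""
    n = len(e)
    return all(
        sum(e[j][k][t] * e[i][t][m] for t in range(n))
        == sum(e[i][j][t] * e[t][k][m] for t in range(n))
        for i, j, k, m in product(range(n), repeat=4)
    )

QUATERNIONS_ASSOCIATIVE = satisfies_condition_c(e)
```

The test covers all 64 triples of basis elements at once. Changing a single entry, for example making i·i equal to +1, causes the check to fail.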

One reason for the importance of quaternion algebras is given by the 
next theorem. 


THEOREM 7. The quaternion algebra over the real numbers R is a
noncommutative division ring.

Proof. Let Q be the algebra of real quaternions. Then direct
computation shows that for all real numbers a, b, c, d

\[ (a + bi + cj + dk)(a - bi - cj - dk) = a^2 + b^2 + c^2 + d^2. \]

If a, b, c, d are not all 0, then a^2 + b^2 + c^2 + d^2 ≠ 0, and it is easily checked
that

\[ \frac{a - bi - cj - dk}{a^2 + b^2 + c^2 + d^2} \]

is a two-sided inverse for a + bi + cj + dk. Since ij ≠ ji, the algebra Q is
certainly noncommutative. □
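
The inverse formula in the proof can be tried out with exact rational arithmetic. The following Python sketch is an illustration with hypothetical function names. A quaternion a + bi + cj + dk is stored as the tuple (a, b, c, d), and the product is the usual expansion of the multiplication table.

```python
from fractions import Fraction

def q_mul(p, q):
    """Product of quaternions given as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def q_inverse(q):
    """Conjugate divided by a^2 + b^2 + c^2 + d^2, as in the proof."""
    a, b, c, d = (Fraction(x) for x in q)
    norm = a * a + b * b + c * c + d * d
    if norm == 0:
        raise ZeroDivisionError("0 has no inverse")
    return (a / norm, -b / norm, -c / norm, -d / norm)

Q = (1, 2, 3, 4)
Q_INV = q_inverse(Q)
```

Multiplying Q by Q_INV on either side gives (1, 0, 0, 0), and q_mul applied to i and j in the two possible orders gives k and -k.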


The ring of real quaternions is our first example of a noncommutative 
division ring. 

Let us turn now to some examples that show how to compute in
algebras defined by arrays of structure constants. Let S be a Z-algebra with basis
x_1, ..., x_n and structure constants e_ijk relative to this basis. There is an
isomorphism of S as a Z-module onto Z^n in which x_i maps to the ith standard
basis vector in Z^n. This isomorphism allows us to use the structure
constants to define an algebra structure on Z^n that is isomorphic to that of
S. Thus, without loss of generality, we may assume that S = Z^n and
x_1, ..., x_n is the standard basis of Z^n. Formula (*) now gives the product

\[ (u_1, \ldots, u_n)(v_1, \ldots, v_n) = (w_1, \ldots, w_n), \]

where

\[ w_k = \sum_{i,j} u_i v_j e_{ijk}. \]

Suppose the e_ijk are given by an APL array E of rank 3 and U and V are
integer vectors of length n. Then the product of U and V in the algebra
defined by E is V+.×U+.×E.




Suppose E is the following rank 3 array.


      ⎕←E←2 2 2⍴19 ¯26 10 ¯13 10 ¯13 5 ¯6
19 ¯26
10 ¯13

10 ¯13
 5  ¯6


It is not hard to show (see Exercise 7a) that E satisfies condition (c) of
Theorem 6, and so E defines an associative multiplication on Z^2. The
computation


      U←¯2 1
      V←3 4
      V+.×U+.×E
¯144 197


shows that the product of (-2, 1) and (3, 4) is (-144, 197).
It is not obvious that the multiplication defined by E has an identity
element. However, from


      I←¯1 2
      X←1 0
      Y←0 1
      X+.×I+.×E
1 0
      I+.×X+.×E
1 0
      Y+.×I+.×E
0 1
      I+.×Y+.×E
0 1


we can see that (-1, 2) acts as a two-sided identity on the standard basis of
Z^2, and it follows easily that (-1, 2) is the identity element for this
multiplication.
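
The same computations can be reproduced without APL. The Python sketch below is an illustration, not the CLASSLIB implementation. The function name algebra_product is hypothetical, indices are zero-based, and the array E is entered with the signs required for (-1, 2) to act as the identity.

```python
E = [[[19, -26], [10, -13]],
     [[10, -13], [5, -6]]]          # E[i][j][k], zero-based indices

def algebra_product(u, v, e):
    """w_k = sum over i, j of u_i * v_j * e[i][j][k]."""
    n = len(e)
    return [sum(u[i] * v[j] * e[i][j][k]
                for i in range(n) for j in range(n))
            for k in range(n)]

U, V = [-2, 1], [3, 4]
I, X, Y = [-1, 2], [1, 0], [0, 1]
PRODUCT = algebra_product(U, V, E)
```

Here PRODUCT is (-144, 197), and multiplying either standard basis vector by I on the left or on the right returns that basis vector.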

The procedures in CLASSLIB with prefix ZA can be used to
compute in a Z-algebra defined by an array of structure constants. The
procedures assume that the structure constants are given by the array ZSC,
which can be initialized with ZAINIT.


      ZAINIT E
      ZSC
19 ¯26
10 ¯13

10 ¯13
 5  ¯6


Now we can use ZAPROD to compute products.


      U ZAPROD V
¯144 197


Addition is just ordinary addition of vectors but may be performed by
using ZASUM if desired.

The procedures with the prefixes ZNA and RA may be used to perform
computations in algebras over Z_n and R, respectively.


EXERCISES 


1    Let S = Z_7[X]/<f>, where f = 6 + 6X + 4X^2 + 5X^3 + X^4, and
     let u be the element of S represented by 2 + 3X^2 + 6X^3. For
     which of the following polynomials g does g(u) = 0?

     (a) 5 + 6X + 2X^2 + X^3.

     (b) 1 + 5X + X^2.

     (c) 1 - X^2.

2    Let S and u be as in Exercise 1. Show that the set J of polynomials
     g in Z_7[X] such that g(u) = 0 is an ideal of Z_7[X] and therefore
     consists of all multiples of a unique monic polynomial h. Find h.


3    Make comparisons of the execution times for the statements


           F ZNXREM G ZNXPROD H


     and


           G ZNXFPROD H


     Here we assume ZNXFINIT F has already been executed and
     that ⍴G and ⍴H are at most the degree of F.


4    Let S be as in Exercise 1 and let v in S be represented by 1 + X^2.
     Show that v is a unit in S and find v^-1.


5    Let G be a finite group and let S be an R-module. Suppose for
     each g in G we have an element x_g of S such that the map g ↦ x_g
     is 1-1 and {x_g | g ∈ G} is a basis for S. For g, h in G define x_g x_h =
     x_gh. Show that condition (b) of Theorem 6 is satisfied and that
     the multiplication on S defined by formula (*) makes S into an
     R-algebra. Show that S is isomorphic to the group ring of G over
     R defined in Exercise 4.11.15.


6    The matrices



     form an R-basis of M_2(R). Determine the structure constants for
     M_2(R) with respect to this basis.

7    Let E be an n-by-n-by-n array of real numbers. Formulate the
     following as APL propositions:

     (a) Condition (c) of Theorem 6.

     (b) The commutativity of multiplication defined on R^n by E.

     (c) The vector I is a two-sided identity for the multiplication.

8    Let a = 2 - 3i + j - 2k and b = -1 + 2i + j - 3k in the algebra
     of real quaternions. Compute a^2, ab + ba, and a^-1.

9    Construct the array E of structure constants for the quaternion
     algebras relative to the basis 1, i, j, k. Show that E satisfies the
     APL proposition of Exercise 7a.

10   Let X = {±1, ±i, ±j, ±k} in the algebra Q of real quaternions. Show
     that X is closed under multiplication in Q and construct a binary
     operation table T for multiplication in X. Show that T is
     associative and so in particular multiplication in Q is associative for triples
     of basis elements. Thus Q satisfies condition (b) of Theorem 6.

11   What are the units in the integer quaternion algebra?

12   Find the order of the group of units in the quaternion algebra
     over Z_3.

13   Let p be a prime. Show that a^p = a for all a in Z_p. Let S be a
     commutative Z_p-algebra. Prove that the map x ↦ x^p is an algebra
     homomorphism of S into itself.

14   Is Z[X]/<2X^2> finitely generated as a Z-module?

15   Let S be the field Z_3[X]/<f>, where f = 1 + X^2. What is the
     multiplicative order of the element in S represented by X? Find
     an element of multiplicative order 8 in S.

16   Suppose S is an R-algebra that is a finitely generated free R-module.
     Show that the map r ↦ r1 of R into S is an algebra embedding of
     R into S.


MODULES OVER 
EUCLIDEAN DOMAINS 


Among the objects traditionally studied in an introductory algebra course 
are finitely generated abelian groups, finite dimensional vector spaces, 
and pairs (V, f) consisting of a finite dimensional vector space V and a 
linear transformation f of V into itself. These three types of algebraic ob- 
jects correspond to modules over three different rings, the integers, a field, 
and the polynomial ring over a field. Since each of these rings is a Euclidean 
domain, it is possible to obtain most of the important results about the 
objects listed as corollaries of theorems about modules over Euclidean 
domains. This approach gives unity to the discussion and demonstrates 
the power of abstraction and the axiomatic approach in algebra. Through- 
out this chapter R will be a Euclidean domain with Euclidean norm N. 


1. ROW EQUIVALENCE 


In this section we begin a discussion that will provide further information
about the submodules of R^n and the units in M_n(R). Initially we will use
only the fact that R is a PID. Later the discussion will depend heavily
on the fact that R is a Euclidean domain. We will draw on many results
proved in Chapters 4 and 5. Of particular importance will be the row
operations over R defined in Section 4.7.

Let n be a positive integer and let U be a submodule of R^n. By
Theorem 4.9.6, we know that R is a PID and so, by Corollary 5.2.16, it follows
that U can be generated by n elements. Thus there is an n-by-n matrix
A over R such that U = S_R(A). This fact is so important that we will
review briefly the steps by which it was obtained. For 1 ≤ i ≤ n let U_i be
the set of elements in U whose first i - 1 components are 0. Each U_i is a
submodule of U, and U = U_1 ⊇ ... ⊇ U_n. Let S_i be the set of ith
components of the elements in U_i. Then S_i is an ideal of R and so has the form
Ra_i for some a_i in S_i. The element a_i is not unique but, by Theorem 4.10.1,
it is determined up to an associate. For each i let u_i be an element of U_i
having a_i as its ith component. Then U is generated by u_1, ..., u_n. If we
take A to be the matrix whose ith row is u_i, then A has the form

\[ A = \begin{pmatrix} a_1 & & & * \\ & a_2 & & \\ & & \ddots & \\ 0 & & & a_n \end{pmatrix}, \]

where all the entries below the main diagonal are 0, and U = S_R(A). We
will call a_i the ith standard invariant of U. It must be remembered,
however, that the a_i are determined only up to associates. The word "standard"
calls attention to the fact that these invariants are relative to the standard
basis of R^n.

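If a_i = 0, we can choose u_i to be 0. Clearly, we may delete rows of
A that are zero without changing S_R(A). If this is done, we obtain a matrix
A_1 with U = S_R(A_1), and A_1 has the form

\[ A_1 = \begin{pmatrix}
0 & \cdots & a_{j_1} & * & \cdots & \cdots & * \\
0 & \cdots & 0 & \cdots & a_{j_2} & \cdots & * \\
\vdots & & & & & \ddots & \vdots \\
0 & \cdots & \cdots & \cdots & 0 & a_{j_r} & \cdots
\end{pmatrix}, \]

where a_{j_1}, ..., a_{j_r} are the nonzero standard invariants of U and all entries
below the "staircase" are zero.
The preceding discussion leads naturally to the following definition.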


An m-by-n matrix B over R is in row-echelon form if there exist integers
r and j_1, ..., j_r such that


1. 0 ≤ r ≤ m and 1 ≤ j_1 < j_2 < ... < j_r ≤ n.
2. B[i;] is 0 for i > r.
3. If 1 ≤ i ≤ r, then B[i;j_i] ≠ 0 and B[i;j] = 0 for 1 ≤ j < j_i.


Condition (3) states that B[i;j_i] is the first nonzero entry in B[i;] and
condition (1) states that these first nonzero entries occur in later columns
as i increases. Thus B has basically the same form as A_1, except that rows
consisting of zeros are allowed, providing they come after all nonzero
rows. We will call the entries B[i;j_i], 1 ≤ i ≤ r, the corner entries of B.
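
The three conditions translate directly into a test. The Python sketch below is an illustration with a hypothetical function name. It uses zero-based column indices and returns the corner columns when the matrix is in row-echelon form, and None otherwise.

```python
def corner_columns(b):
    """Return the corner columns j_1 < ... < j_r (zero-based) if b is in
    row-echelon form, or None otherwise."""
    columns = []
    seen_zero_row = False
    for row in b:
        first = next((j for j, x in enumerate(row) if x != 0), None)
        if first is None:
            seen_zero_row = True       # zero rows must come last
            continue
        if seen_zero_row or (columns and first <= columns[-1]):
            return None
        columns.append(first)
    return columns

EXAMPLE = [[0, 2, 1, -3, 2, 4],
           [0, 0, -3, 4, 1, 0],
           [0, 0, 0, 0, 1, -2],
           [0, 0, 0, 0, 0, 0]]
COLUMNS = corner_columns(EXAMPLE)
CORNERS = [EXAMPLE[i][j] for i, j in enumerate(COLUMNS)]
```

For the matrix EXAMPLE the corner columns are the second, third, and fifth, and the corner entries are 2, -3, and 1.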


The matrix

\[ B = \begin{pmatrix}
0 & 2 & 1 & -3 & 2 & 4 \\
0 & 0 & -3 & 4 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & -2 \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix} \]

is in row-echelon form and its corner entries are 2, -3, and 1.


THEOREM 1. Let U be a submodule of R^n and let a_{j_1}, ..., a_{j_r} be
the nonzero standard invariants of U with j_1 < j_2 < ... < j_r. There exists
an r-by-n matrix B such that


(a) B is in row-echelon form.

(b) The ith corner entry of B is a_{j_i} and occurs in column j_i.

(c) U = S_R(B).

Proof. The matrix B is simply the preceding matrix A_1. □

The following important result is essentially the converse of Theorem 1.


THEOREM 2. Let B be an m-by-n matrix over R that is in row-echelon
form with corner entries B[i;j_i], 1 ≤ i ≤ r. Set U = S_R(B). Then


(a) The nonzero rows of B are a basis for U.
(b) If a_1, ..., a_n are the standard invariants of U, then a_j = 0 if j is
not in {j_1, ..., j_r} and a_{j_i} is an associate of B[i;j_i].


Proof. Clearly, the nonzero rows of B generate U so, to prove (a), we
need only show that the first r rows of B are linearly independent over R.
Suppose there exist elements c_1, ..., c_r of R such that c_1 B[1;] + ... +
c_r B[r;] = 0 and not all of the c_i are 0. Let c_i be the first nonzero
coefficient. The j_i-th component of c_i B[i;] + ... + c_r B[r;] is

\[ c_i B[i;j_i] + c_{i+1} B[i+1;j_i] + \cdots + c_r B[r;j_i], \]

which must be 0. But if k > i, then B[k;j_i] = 0, so we have

\[ c_i B[i;j_i] = 0. \]

Since B[i;j_i] ≠ 0, this means c_i = 0, contradicting our choice of i. Thus
the nonzero rows of B are a basis for U.

Let U_j be the submodule of U consisting of the elements of U whose
first j - 1 components are 0. A slight modification of the argument in the
preceding paragraph shows that U_j is generated by the rows B[i;] for which
j_i ≥ j. For suppose c_1 B[1;] + ... + c_r B[r;] is in U_j and c_i ≠ 0 for some
i with j_i < j. Choose i as small as possible. Then, as before,

\[ c_i B[i;j_i] + \cdots + c_r B[r;j_i] = 0 \]

and B[k;j_i] = 0 for k > i. Thus c_i = 0, which we assumed not to be the
case. Now let S_j be the set of jth components of the elements of U_j and
let i be the smallest index such that j_i ≥ j. Then U_j is generated by B[i;], ...,
B[r;], so S_j is generated by B[i;j], ..., B[r;j]. But if k > i, then j < j_k
and B[k;j] = 0. Thus S_j is generated by B[i;j]. Therefore a_j and B[i;j]
are associates. If j ≠ j_i, then a_j = B[i;j] = 0. □


COROLLARY 3. Every submodule of R^n is a free R-module.
Proof. By Theorems 1 and 2a, every submodule of R^n has a basis. □
COROLLARY 4. Suppose A and B are matrices over R such that


(a) Both A and B have n columns.
(b) Both A and B are in row-echelon form.
(c) S_R(A) = S_R(B).


Then A and B have the same number r of nonzero rows and the ith corner
entries of A and of B occur in the same column and are associates, 1 ≤ i ≤
r.


Proof. Let U be S_R(A) and let a_1, ..., a_n be the standard invariants
for U. By Theorem 2, the jth column of A contains a corner entry if and
only if a_j ≠ 0. If a_j ≠ 0, then the corner entry in the jth column is an
associate of a_j. Since U = S_R(B) also, the corner entries of A and B are
associates and occur in the same columns. □


If B is in row-echelon form, it is easy to decide whether a given
element x of R^n is in S_R(B). Consider the following example in Z^4. Let

\[ B = \begin{pmatrix} 1 & -1 & 3 & 0 \\ 0 & 2 & 4 & -2 \\ 0 & 0 & 0 & 3 \end{pmatrix} \]

and x = (4, -10, 0, 9). Is x in U = S_Z(B)? That is, can we find an integer
vector c = (c_1, c_2, c_3) such that x is c+.×B? Computing the first, second,
and fourth components of c+.×B, we have

\[ \begin{aligned} c_1 &= 4, \\ -c_1 + 2c_2 &= -10, \\ -2c_2 + 3c_3 &= 9. \end{aligned} \]

Substituting c_1 = 4 in the second equation, we get -4 + 2c_2 = -10 or c_2 =
-3. Substituting in the third equation, we obtain 6 + 3c_3 = 9. Thus c_3 = 1
and c = (4, -3, 1). We must still check that c+.×B is really x, since it is
possible that the third components of c+.×B and x may not be the same.
However, x is, in fact, (4, -3, 1)+.×B.

If we use the same procedure to try to write y = (-3, 4, 1, -2) as
a Z-linear combination (d_1, d_2, d_3)+.×B of the rows of B, then we are led
to the equations

\[ \begin{aligned} d_1 &= -3, \\ -d_1 + 2d_2 &= 4, \\ -2d_2 + 3d_3 &= -2. \end{aligned} \]

Substituting d_1 = -3 in the second equation gives 3 + 2d_2 = 4 or 2d_2 = 1.
This equation has no solutions in Z and so y is not in U.
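
The substitution procedure used in these two examples can be sketched in Python. This is an illustration with a hypothetical function name, written for integer matrices only and assuming the matrix is already in row-echelon form. It works through the rows in order, divides by each corner entry, and then checks that nothing is left over.

```python
def coefficients_in_span(b, x):
    """Integer coefficients c with c.B == x for B in row-echelon form,
    or None if x is not in the Z-span of the rows of B."""
    rest = list(x)
    coeffs = []
    for row in b:
        j = next((k for k, entry in enumerate(row) if entry != 0), None)
        if j is None:                 # a zero row contributes nothing
            coeffs.append(0)
            continue
        q, r = divmod(rest[j], row[j])
        if r != 0:                    # corner entry does not divide
            return None
        coeffs.append(q)
        rest = [a - q * e for a, e in zip(rest, row)]
    # the final check covers the components that are not in corner columns
    return coeffs if all(a == 0 for a in rest) else None

B = [[1, -1, 3, 0],
     [0, 2, 4, -2],
     [0, 0, 0, 3]]
C = coefficients_in_span(B, [4, -10, 0, 9])
D = coefficients_in_span(B, [-3, 4, 1, -2])
```

Here C is (4, -3, 1), while D is None because 2d_2 = 1 has no integer solution.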

The fact that B is in row-echelon form was very important in making
the preceding computations so easy. If

\[ A = \begin{pmatrix} 5 & -7 & 11 & 5 \\ 6 & -4 & 22 & -2 \\ 10 & -10 & 30 & 3 \end{pmatrix}, \]

then the preceding method may be used to show that each row of A is in
S_Z(B), and so S_Z(A) ⊆ S_Z(B). It is actually the case that S_Z(A) = S_Z(B),
but it is not obvious how to write the rows of B as Z-linear combinations
of the rows of A. By Theorem 1, there exists a matrix C in row-echelon
form such that S_Z(C) = S_Z(A) but, so far, we do not have a procedure for
calculating such a C.

In Section 4.7 we introduced row operations and studied their effect
on the determinant of a square matrix. We showed that a row operation
O can be applied to an m-by-n matrix A by forming E+.×A, where E is
the m-by-m elementary matrix corresponding to O. We also proved that
elementary matrices are units in M_m(R). It turns out that row operations
play a key role in our solution to the problem of finding, for any matrix
A over R, a matrix C in row-echelon form such that S_R(C) = S_R(A).

Let A and B be m-by-n matrices over R. We say that A and B are
row equivalent over R if there exists a sequence of matrices

\[ A = A_0, A_1, \ldots, A_t = B \]

such that for 1 ≤ i ≤ t the matrix A_i is obtained from A_{i-1} by an
elementary row operation over R. Thus A is row equivalent to B if and only if
B can be obtained from A by a sequence of row operations.


THEOREM 5. Row equivalence is an equivalence relation on the
set of m-by-n matrices over R.


Proof. If A is an m-by-n matrix over R, then A can be obtained from
A by the empty sequence of row operations, so row equivalence is reflexive.
If A = A_0, A_1, ..., A_t = B is a sequence of matrices with A_i obtained from
A_{i-1} by a row operation, then by Theorem 4.7.2 there exists a row
operation that converts A_i back to A_{i-1}. Thus, if A is row equivalent to B,
then B is row equivalent to A, so row equivalence is symmetric. Finally,
if A is row equivalent to B and B is row equivalent to C, it is obvious how
to transform A into C using row operations. First, transform A into B; then
transform B into C. Therefore row equivalence is transitive. □


THEOREM 6. Two m-by-n matrices A and B over R are row
equivalent if and only if there is an m-by-m matrix E such that E is a product of
elementary matrices and B = E+.×A.


Proof. Suppose A and B are row equivalent so that there are matrices
A = A_0, A_1, ..., A_t = B such that for 1 ≤ i ≤ t there is a row operation
O_i transforming A_{i-1} to A_i. Let E_i be the elementary matrix obtained by
applying O_i to the m-by-m identity matrix. Then, suppressing the symbol
+.×, we have A_i = E_i A_{i-1} and B = E_t E_{t-1} ... E_1 A = EA, where E =
E_t E_{t-1} ... E_1 is a product of elementary matrices.

Conversely, suppose B = EA, where E = E_t E_{t-1} ... E_1 and each
E_i is an elementary matrix. Set A_0 = A and A_i = E_i A_{i-1}, 1 ≤ i ≤ t. Then
A_t = B and each A_i is obtained from A_{i-1} by a row operation. Thus B is
row equivalent to A. □


COROLLARY 7. If two matrices A and B are row equivalent, then
S_R(A) = S_R(B). Moreover, the rows of A are a basis for U = S_R(A) if and
only if the rows of B are a basis for U.


Proof. Suppose A and B are row equivalent m-by-n matrices. By
Theorem 6, there is a matrix E = E_t E_{t-1} ... E_1 such that each E_i is an
elementary matrix and B = EA. Each E_i is a unit in M_m(R), and so E is a
unit in M_m(R) and A = E^-1 B. By Theorem 5.2.6, S_R(A) = S_R(B). If the
rows of A are a basis for U = S_R(A), then

\[ B[i;] = \sum_j E[i;j]\, A[j;] \]

and so, by Lemma 5.2.11, the rows of B are a basis for U. By symmetry, it
follows that if the rows of B are a basis for U, then so are the rows of A. □


So far we have really used only the fact that R is a PID. The next
theorem uses our hypothesis that R has a Euclidean norm.


THEOREM 8. Every matrix with entries in R is row equivalent over
R to a matrix in row-echelon form.


Proof. Let A be an m-by-n matrix over R. We will prove the theorem
by describing a procedure for finding a sequence of row operations that
transforms A into a matrix in row-echelon form. The procedure is recursive
in the sense that it reduces the problem to finding a row-echelon form
matrix row equivalent to a certain (m - 1)-by-n matrix. Thus the proof of
the validity of the procedure is by induction on m.

If A = 0 or if m = 1, then A is in row-echelon form to begin with and
we are done. Thus we may assume that m ≥ 2 and A ≠ 0. Let j be the
number of the first column of A containing a nonzero entry. Our first
task is to show that using row operations we can change A so that it has
the form

\[ \left(\begin{array}{c|c|c} 0 & * & \cdots \\ \hline 0 & 0 & \cdots \end{array}\right), \]

where the entry * is nonzero and in the jth column. Let a = A[i;j] be an
entry in A[;j] that is nonzero and for which the norm N(a) is as small
as possible. Assume for the moment that a divides every entry in A[;j].
Then, for each k with 1 ≤ k ≤ m and k ≠ i, we can write A[k;j] = e_k a
with e_k in R. If, for each such k, we subtract e_k A[i;] from A[k;], then
we obtain a matrix in which the only nonzero entry in the jth column is
a in the ith position. Interchanging rows 1 and i, we have a matrix of the
required form.

But suppose a does not divide every entry in A[;j]. Then, for some
k ≠ i, we can write A[k;j] as qa + r, where r ≠ 0 and N(r) < N(a). If we
subtract qA[i;] from A[k;], then the new entry in the jth position of
A[k;] is r, which has norm less than N(a). At this point, we replace i by k
and a by r and repeat the process. Either a divides every entry in the jth
column or else a row operation can be found that produces a matrix with a
nonzero entry in the jth column that has norm less than N(a). Since the values of
N(a) are nonnegative integers, they cannot continue decreasing indefinitely.
Eventually we will reach the situation in which a divides all the entries in
the jth column and so obtain a matrix B of the required form that is row
equivalent to A.

Now let C be the (m - 1)-by-n matrix consisting of all the rows of
B except the first (1 0↓B in APL notation). By induction on m, we can find
a sequence of row operations that transforms C into a matrix D in
row-echelon form. Since row operations do not change columns that contain
only zeros, D has the form

\[ D = \left(\begin{array}{c|c} 0 & D' \end{array}\right), \]

where D' has n - j columns and is in row-echelon form. We can consider
the row operations that transform C to D to be acting on the last m - 1
rows of B. The result of these operations is to transform B into

\[ \left(\begin{array}{c|c|c} 0 & a & * \\ \hline 0 & 0 & D' \end{array}\right), \]

which is in row-echelon form. Thus we have found a matrix in row-echelon
form that is row equivalent to A. □


The procedure presented in the proof of Theorem 8 is called row
reduction. We will illustrate it with a few examples. First, suppose


      ⎕←A←3 4⍴5 ¯7 11 5 6 ¯4 22 ¯2 10 ¯10 30 3
 5  ¯7 11  5
 6  ¯4 22 ¯2
10 ¯10 30  3


so that A is one of the matrices that appeared in our earlier example. The
entry in the first column with the smallest absolute value is A[1;1], but
it does not divide every entry of A[;1]. Performing a row operation,


      ⎕IO←1
      A[2;]←A[2;]-A[1;]
      A
 5  ¯7 11  5
 1   3 11 ¯7
10 ¯10 30  3


we obtain a nonzero entry in the first column that has absolute value less
than 5. The entry A[2;1] now divides all entries in A[;1]. Three more
row operations


      A[1;]←A[1;]-5×A[2;]
      A[3;]←A[3;]-10×A[2;]
      A[1 2;]←A[2 1;]
      A
1   3  11 ¯7
0 ¯22 ¯44 40
0 ¯40 ¯80 73


yield a matrix with only one nonzero entry in the first column. Now we
need only work with the submatrix A[2 3;]. Three additional row
operations are needed to fix up the second column.


      A[3;]←A[3;]-2×A[2;]
      A
1   3  11 ¯7
0 ¯22 ¯44 40
0   4   8 ¯7
      A[2;]←A[2;]+6×A[3;]
      A
1 3 11 ¯7
0 2  4 ¯2
0 4  8 ¯7
      A[3;]←A[3;]-2×A[2;]
      A
1 3 11 ¯7
0 2  4 ¯2
0 0  0 ¯3


Now A is in row-echelon form.
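
The procedure from the proof of Theorem 8 can be sketched in Python for R = Z, with the absolute value as the Euclidean norm. This is an illustration with a hypothetical function name, written iteratively rather than recursively. It uses only row interchanges and the addition of an integer multiple of one row to another, and in each column it repeatedly reduces the other entries modulo the entry of smallest norm.

```python
def row_echelon_over_z(a):
    """Row-reduce an integer matrix using only elementary row operations
    over Z (swap rows, add an integer multiple of one row to another)."""
    a = [list(row) for row in a]
    m, n = len(a), len(a[0]) if a else 0
    top = 0                            # rows above `top` are finished
    for col in range(n):
        if top == m:
            break
        while True:
            nonzero = [i for i in range(top, m) if a[i][col] != 0]
            if not nonzero:
                break                  # nothing to do in this column
            # entry of smallest norm (absolute value) in this column
            piv = min(nonzero, key=lambda i: abs(a[i][col]))
            for i in nonzero:
                if i != piv:
                    q = a[i][col] // a[piv][col]
                    a[i] = [x - q * y for x, y in zip(a[i], a[piv])]
            if all(a[i][col] == 0 for i in range(top, m) if i != piv):
                a[top], a[piv] = a[piv], a[top]
                top += 1
                break
    return a

A = [[5, -7, 11, 5],
     [6, -4, 22, -2],
     [10, -10, 30, 3]]
REDUCED = row_echelon_over_z(A)
```

On this input the sketch happens to arrive at the same matrix as the hand computation. In general a different choice of row operations may give a different row-echelon matrix, but by Corollary 4 the corner entries occupy the same columns and agree up to associates.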
When R is a field, row reduction works much faster. For example,
consider the matrix


      ⎕←U←3 3⍴2 4 1 6 5 2 4 3 5
2 4 1
6 5 2
4 3 5


which we take to represent an element of M_3(Z_11). The multiplicative
inverse of 2 in Z_11 is 6. Multiplying U[1;] by 6 and reducing modulo 11,


      U[1;]←11|6×U[1;]


we make U[1;1] equal to 1. It is now easy to complete the reduction
of the first column.


      U[2 3;]←11|U[2 3;]-6 4∘.×U[1;]
      U
1 2  6
0 4 10
0 6  3


Note the way in which several row operations of type 3 can be performed
with one APL statement. We can finish the reduction of the entire matrix
in a similar manner.


      U[2;]←11|3×U[2;]
      U
1 2 6
0 1 8
0 6 3
      U[3;]←11|U[3;]-6×U[2;]
      U
1 2  6
0 1  8
0 0 10
      U[3;]←11|10×U[3;]
      U
1 2 6
0 1 8
0 0 1


Over a field we may always make the corner entries of a matrix in
row-echelon form equal to 1, and this is normally done.
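As our last example, let us consider

\[ \begin{pmatrix} 1 + X^2 & 1 + X \\ X - X^2 & 1 + X + X^2 \end{pmatrix} \]

in M_2(Z_3[X]). Adding the first row to the second, we get

\[ \begin{pmatrix} 1 + X^2 & 1 + X \\ 1 + X & -1 - X + X^2 \end{pmatrix}. \]

Subtracting -1 + X times the second row from the first gives us

\[ \begin{pmatrix} -1 & X - X^2 - X^3 \\ 1 + X & -1 - X + X^2 \end{pmatrix}. \]

Finally, adding 1 + X times the first row to the second, we obtain

\[ \begin{pmatrix} -1 & X - X^2 - X^3 \\ 0 & -1 + X^2 + X^3 - X^4 \end{pmatrix}, \]

which is in row-echelon form.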


EXERCISES 
1    Let U and V be submodules of R^n with U ⊆ V. Suppose that
     (a_1, ..., a_n) and (b_1, ..., b_n) are the standard invariants for
     U and V, respectively. Prove that b_i divides a_i, 1 ≤ i ≤ n. Show
     also that if a_i and b_i are associates for all i, then U = V.




aun 


Show that there is no submodule V of Z? such that V contains 
U and V has standard invariants (3, 4). 


Compute the number of 2-by-3 matrices over Z, that are in row- 
echelon form. (The ring Z, is not an integral domain unless 7 is a 
prime, but the definition of row-echelon form makes sense for 
matrices over any ring.) 

For each of the following matrices A, determine the rank of U = 
S7 (A), a basis for U, and the standard invariants of U. 


(a) _ 2 9 | 
—5 4 ]- 


Let U = Sz(A), where 


(b) 6 -8 15 
A=|{| 10 -—-14 26 
—15 9 2] 


5 4 0 2 


considered as a matrix over R. Find the dimension of U = Sp (A) 


and a basis for U. 
34+ 5i —4+ 3] 
A= . 
34+i —] +3i 


Let R = Z[i] and 
Determine the standard invariants for Sp (A). 
Let R = Z,[X] and 


X+X? X+X? l 
A= X3 ] 1+X 
X? ] ] 


Determine the standard invariants for Sp (A). 
8    Let A be as in Exercise 4a. Is (3, 17) in S_Z(A)? Is (-2, 8) in S_Z(A)?
9    Let A be as in Exercise 4b. Are (3, 0, 1) and (2, -1, 7) in S_Z(A)?

10   Let R and A be as in Exercise 5. Are (1, 1, 5, 5) and (3, 2, 4, 6)
     in S_R(A)?

11   Let R and A be as in Exercise 6. Is (2i, 1) in S_R(A)?

12 Let R and A beas in Exercise 7. Is (X, X, 1 + X + X*) in Sp (A)? 


2. ROW EQUIVALENCE, CONTINUED 


In this section we will continue the investigations begun in Section 1. Among
the important topics we will consider are the problem of determining a
unique representative for each row-equivalence class of matrices over R, a
converse of Corollary 1.7, a proof that the group GL_n(R) of units in M_n(R)
is generated by elementary matrices, and an algorithm for computing
inverses of matrices.

Let A be a matrix over R. By Theorem 1.8, there is a matrix B in row- 
echelon form that is row equivalent to A. Unfortunately, B is generally 
not unique. For example, the matrices 


oo} Go) eS 


are all in row-echelon form and are row equivalent over Z to one another. 
It is very useful to be able to add extra conditions on B in such a way that 
B becomes unique. The manner in which this is done depends on the ring 
R. The case in which R is a field is the easiest to handle. 

Let F be a field. A matrix B with entries in F is said to be row reduced
over F if the following conditions are satisfied.


1. B is in row-echelon form.
2. Each corner entry of B is 1.
3. If B[i;j] is a corner entry, then B[k;j] = 0 for all k ≠ i.


Condition (3) states that not only are entries below a corner entry equal to
0 but all of the entries above a corner entry are also 0. For example, the
matrix

\[ A = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 2 \end{pmatrix} \]

is row reduced over Q but neither 


nor 




o- F | Pe 
Oo 1 2 


is row reduced over Q, since B[1;1] isnot 1 and C[1;2] is not 0. 


THEOREM 1. Let A be a matrix over a field F. Then A is row equiva- 
lent to exactly one matrix that is row reduced over F. 


Proof. By Theorem 1.8, there is a matrix B in row-echelon form that is
row equivalent to A. As we have already noted, we can multiply the ith
nonzero row of B by the reciprocal of its first nonzero entry B[i;j_i] and so
make all of the corner entries 1. To obtain a matrix that is row reduced
over F, we have only to change B so all of the entries above each corner
entry are zeros. There are no entries above the first corner entry B[1;j_1].
Subtracting B[1;j_2] times B[2;] from B[1;] makes B[1;j_2] zero. Now
subtracting B[1;j_3] times B[3;] from B[1;] and B[2;j_3] times B[3;] from
B[2;] makes B[1;j_3] and B[2;j_3] zero without affecting the earlier
columns. Continuing in this manner, we get a matrix that is row reduced over F
and row equivalent to A.

To prove uniqueness, suppose B and C are each row reduced over
F and also row equivalent to each other. The corner entries of B and C
occur in the same columns j_1 < j_2 < ... < j_r and S_F(B) = S_F(C). Suppose
B ≠ C. Then, for some i, the rows B[i;] and C[i;] are different. If i > r,
then B[i;] = C[i;] = 0. Thus 1 ≤ i ≤ r. The entries B[i;j_i] and C[i;j_i] are
both 1, while B[i;j_k] = C[i;j_k] = 0 if k ≠ i. Thus, if x = B[i;] - C[i;],
then x is in S_F(B) and the j_k th component of x is 0, 1 ≤ k ≤ r. Since x is
in S_F(B), we can find e_1,...,e_r in F such that

\[ x = \sum_{k=1}^{r} e_k B[k;]. \]

However, the only nonzero entry in B[;j_k] is B[k;j_k], so the j_k th
component of the sum on the right is e_k. Therefore e_k = 0 for all k and x = 0.
Thus B = C. □


The proof of Theorem 1 actually tells us how to find the row-reduced
matrix row equivalent to a given matrix. For example, suppose F = Z_5 and

      ⎕←A←B←3 3⍴2 3 3 3 1 4 4 2 4
2 3 3
3 1 4
4 2 4

To transform B into a matrix row equivalent to A and row reduced over
Z_5, we proceed as follows.

      B[1;]←5|3×B[1;]
      B
1 4 4
3 1 4
4 2 4
      B[2 3;]←5|B[2 3;]-3 4∘.×B[1;]
      B
1 4 4
0 4 2
0 1 3
      B[2;]←5|4×B[2;]
      B
1 4 4
0 1 3
0 1 3
      B[1 3;]←5|B[1 3;]-4 1∘.×B[2;]
      B
1 0 2
0 1 3
0 0 0
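
The same reduction can be sketched outside APL. The following Python function is only an illustration and is not the CLASSLIB procedure; the name rref_mod_p is an assumption made here. It clears each corner column completely as soon as the corner is found, rather than first reaching row-echelon form as in the proof, but by Theorem 1 the result is the same matrix.

```python
def rref_mod_p(rows, p):
    """Return the matrix row reduced over Z_p (p prime) that is row
    equivalent to the given integer matrix.  Hypothetical helper."""
    m = [[x % p for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    r = 0  # index of the next row that needs a corner entry
    for c in range(n_cols):
        # Look for a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, n_rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        inv = pow(m[r][c], -1, p)              # reciprocal of the corner entry
        m[r] = [(inv * x) % p for x in m[r]]   # make the corner entry 1
        for i in range(n_rows):                # clear the rest of column c
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [(x - f * y) % p for x, y in zip(m[i], m[r])]
        r += 1
        if r == n_rows:
            break
    return m


print(rref_mod_p([[2, 3, 3], [3, 1, 4], [4, 2, 4]], 5))
```

The call shown uses the matrix A of the example with p = 5. The three-argument pow with exponent -1 requires Python 3.8 or later.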




Over the integers the situation is a little more complicated. Let A be 
an integer matrix. It is not always possible to find a matrix B that is row 
equivalent to A over Z and is also row reduced over Q. We say that B is 
row reduced over Z provided the following hold. 


1. B is in row-echelon form.
2. All corner entries of B are positive.
3. If B[i;j] is a corner entry and 1 ≤ k < i, then 0 ≤ B[k;j] < B[i;j].


For example, 


0 0 0 2 


is row reduced over Z. With this definition the analogue of Theorem 1 holds. 


THEOREM 2. Let A be an integer matrix. Then A is row equivalent 
over Z to exactly one matrix that is row reduced over Z. 


Proof. We can find a matrix B in row-echelon form row equivalent to A
over Z. By multiplying rows of B by -1 if necessary, we may assume all the
corner entries B[i;j_i] are positive. Write B[1;j_2] in the form aB[2;j_2] + b,
where a and b are integers and 0 ≤ b < B[2;j_2]. Subtracting a times B[2;]
from B[1;], we make B[1;j_2] equal to b and so 0 ≤ B[1;j_2] < B[2;j_2].
Proceeding with the entries above each succeeding corner entry, we obtain
a matrix B row equivalent to A and row reduced over Z. The uniqueness
of B we leave to the reader. (See Exercise 1.) □


We will also need a definition of "row reduced" for matrices of
polynomials. Let F be a field. A matrix B with entries in F[X] is row reduced
over F[X] if the following conditions hold.


1. B is in row-echelon form.
2. The corner entries of B are monic polynomials.
3. If B[i;j] is a corner entry and 1 ≤ k < i, then either B[k;j] is 0
or the degree of B[k;j] is less than the degree of B[i;j].


The matrix

\[
\begin{bmatrix}
1 + X & 1 - X & 2X + X^2 \\
0 & 2 + X^2 & 0 \\
0 & 0 & 3 + X^3
\end{bmatrix}
\]

is row reduced over Z_5[X].


THEOREM 3. Let F be a field and let A be a matrix with entries in
F[X]. Then A is row equivalent over F[X] to a unique matrix that is row
reduced over F[X].


Proof. See Exercise 2. □


Let A be a matrix over R. We know how to find a matrix B that is row
equivalent to A and in row-echelon form. If R happens to be Z, a field F,
or the ring F[X], then we can choose B to be row reduced over R. However,
we often need more. By Theorem 1.6, there is a matrix E that is a product
of elementary matrices such that B = E+.×A. We sometimes need to know
one such matrix E. The procedures described in the proofs of Theorems 1.8,
1, and 2 allow us to solve this problem. We know that E = E_s E_{s-1} ... E_1 is
the product of the elementary matrices corresponding to the row operations
that transform A into B. If we apply these same row operations to the
m-by-m identity matrix I, where m is the number of rows of A, then the
result is E+.×I or E. Thus, to compute both B and E, we first set B = A
and E = I. Then, as we apply row operations to B in order to row reduce
it, we apply the same operations to E. In this way the equation B = E+.×A
remains correct throughout the computation. Here is an example with
R = Z.


      ⎕←A←B←2 2⍴5 4 3 3               ⎕←E←(⍳2)∘.=⍳2
5 4                             1 0
3 3                             0 1
      B[1;]←B[1;]-B[2;]               E[1;]←E[1;]-E[2;]
      B                               E
2 1                             1 ¯1
3 3                             0  1
      B[2;]←B[2;]-B[1;]               E[2;]←E[2;]-E[1;]
      B                               E
2 1                              1 ¯1
1 2                             ¯1  2
      B[1;]←B[1;]-2×B[2;]             E[1;]←E[1;]-2×E[2;]
      B                               E
0 ¯3                             3 ¯5
1  2                            ¯1  2
      ⎕←B←B[2 1;]                     ⎕←E←E[2 1;]
1  2                            ¯1  2
0 ¯3                             3 ¯5
      B[2;]←-B[2;]                    E[2;]←-E[2;]
      B                               E
1 2                             ¯1 2
0 3                             ¯3 5
      E+.×A
1 2
0 3


The matrix B is now row reduced over Z and B is E+.×A.
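
The bookkeeping in this example, applying every row operation to E as well as to B, can be sketched in Python. This is an illustration under our own assumptions and not the CLASSLIB procedure ZROWREDUCE: the function name z_row_reduce and its return convention (B, E) are hypothetical, and the order of row operations differs from the hand computation above, although Theorem 2 guarantees the same B.

```python
def z_row_reduce(a):
    """Row reduce an integer matrix over Z.
    Returns (B, E) with B row reduced over Z and B = E*A.  Hypothetical helper."""
    b = [list(row) for row in a]
    m, n = len(b), len(b[0])
    e = [[int(i == j) for j in range(m)] for i in range(m)]  # identity matrix

    def add_multiple(dst, src, q):
        # Row dst <- row dst - q * row src, applied to both B and E.
        b[dst] = [x - q * y for x, y in zip(b[dst], b[src])]
        e[dst] = [x - q * y for x, y in zip(e[dst], e[src])]

    r = 0  # index of the next row that needs a corner entry
    for c in range(n):
        # Euclidean algorithm on column c among rows r, ..., m-1.
        while True:
            nz = [i for i in range(r, m) if b[i][c] != 0]
            if len(nz) <= 1:
                break
            k = min(nz, key=lambda i: abs(b[i][c]))
            for i in nz:
                if i != k:
                    add_multiple(i, k, b[i][c] // b[k][c])
        if not nz:
            continue  # no corner entry in this column
        k = nz[0]
        b[r], b[k] = b[k], b[r]
        e[r], e[k] = e[k], e[r]
        if b[r][c] < 0:  # make the corner entry positive
            b[r] = [-x for x in b[r]]
            e[r] = [-x for x in e[r]]
        for i in range(r):  # bring entries above the corner into [0, corner)
            add_multiple(i, r, b[i][c] // b[r][c])
        r += 1
        if r == m:
            break
    return b, e


B, E = z_row_reduce([[5, 4], [3, 3]])
print(B, E)
```

Because floor division leaves a remainder strictly smaller in absolute value than the divisor, the inner loop terminates, and the final pass over the rows above each corner enforces condition (3) of the definition.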

The workspace CLASSLIB contains procedures for row reducing
matrices over R, Z, Z_n, R[X], and Z_n[X]. In the case of Z_n[X], the
integer n must be a prime. For example, if A is an integer matrix, then

      ZROWREDUCE A

is the unique matrix B row equivalent to A over Z and row reduced over
Z. In addition, ZROWREDUCE computes a global variable R that is a product
of elementary integer matrices such that B is R+.×A. Using the preceding
example, we have

      A                               R
5 4                             ¯1 2
3 3                             ¯3 5
      ⎕←B←ZROWREDUCE A                R+.×A
1 2                             1 2
0 3                             0 3

The other procedures for row reduction work in a similar manner.
Row operations can be used to compute inverses in M_n(R). However,
before learning how, we must prove two theorems.




THEOREM 4. Let A and B be m-by-n matrices over R. Then A and
B are row equivalent over R if and only if S_R(A) = S_R(B).


Proof. If A and B are row equivalent, then by Corollary 1.7 we know
that S_R(A) = S_R(B). Thus we have only to show the converse.

Set U = S_R(A) and suppose S_R(B) = U also. We must show that A and
B are row equivalent. By Theorems 1.5 and 1.8, we may assume that both
A and B are in row-echelon form. From Corollary 1.4 it follows that A and
B have the same number r of nonzero rows and the ith corner entries of A
and B occur in the same column j_i and are associates, 1 ≤ i ≤ r. Multiplying
the rows of A by units in R if necessary, we may assume A[i;j_i] = B[i;j_i]
for each i. If r = 0, then A = B = 0. Thus we may assume r ≥ 1. The vector
x = A[1;] - B[1;] is in U and has zeros for its first j_1 components. By the
proof of Theorem 2, we know that x is in the submodule of R^n spanned
by B[2;],...,B[r;]. Thus we can find e_2,...,e_r in R such that

\[ x = \sum_{i=2}^{r} e_i B[i;]. \]

Now A[1;] = B[1;] + x and, if we add e_i B[i;] to B[1;] for 2 ≤ i ≤ r, then
the new B is row equivalent to the old and B[1;] is now equal to A[1;].
Let A' and B' be the matrices obtained by deleting the first rows of A and
B, respectively. Then S_R(A') = S_R(B') = U', where U' is the submodule
of U consisting of the elements whose first j_1 components are 0. By
induction on m, the matrices A' and B' are row equivalent and, hence, so
are A and B. □


Now we can provide additional information about the units in M_n(R).


THEOREM 5. Let A be a matrix in M_n(R). Then the following are
equivalent.

(a) A is a unit in M_n(R).

(b) det A is a unit in R.

(c) S_R(A) = R^n.

(d) The rows of A are a basis for R^n.

(e) A is row equivalent to the n-by-n identity matrix.

(f) A is a product of elementary matrices.


Proof. By Theorem 5.2.12, conditions (a), (b), (c), and (d) are
equivalent. Let I be the n-by-n identity matrix. Since S_R(I) = R^n, the
equivalence of (d) and (e) follows from Theorem 4. Thus all that remains is to
show that (f) is equivalent to the others. Now elementary matrices are
units, so (f) implies (a). Suppose (e) holds. Then there is a matrix E =
E_s E_{s-1} ... E_1 such that each E_i is an elementary matrix and EA = I.
Multiplying on the left by E^{-1}, we get A = E^{-1} = E_1^{-1} ... E_s^{-1}. By
Theorem 4.7.4, each E_i^{-1} is an elementary matrix, so (f) holds. □


Let A be a matrix in M_n(R). Theorem 5 gives us a new method for
deciding whether or not A is a unit and, if it is, for computing A^{-1}. First,
row reduce A to a matrix B in row-echelon form using the procedure of
Theorem 1.8. If B has fewer than n nonzero rows or if one of the corner
entries of B is not a unit in R, then B is not row equivalent to I and A is
not a unit. However, if B has n nonzero rows and all corner entries are units
in R, we may row reduce B to I using the procedure in the proof of
Theorem 1. Applying to I the row operations used to obtain I from A, we get
A^{-1}.
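
For R = Z the method can be sketched in Python by carrying the identity matrix along as extra columns, so that integer row operations take [A | I] to [I | A^{-1}]. The function name z_mat_inv is an assumption made for this sketch; it is not the CLASSLIB procedure ZMATINV. The function returns None when some corner entry is missing or is not a unit of Z, that is, not 1 or -1.

```python
def z_mat_inv(a):
    """Inverse of a square integer matrix over Z by row reduction,
    or None if the matrix is not a unit in M_n(Z).  Hypothetical helper."""
    n = len(a)
    # Augment A with the identity matrix.
    w = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(a)]
    for c in range(n):
        # Euclidean algorithm on column c among rows c, ..., n-1.
        while True:
            nz = [i for i in range(c, n) if w[i][c] != 0]
            if len(nz) <= 1:
                break
            k = min(nz, key=lambda i: abs(w[i][c]))
            for i in nz:
                if i != k:
                    q = w[i][c] // w[k][c]
                    w[i] = [x - q * y for x, y in zip(w[i], w[k])]
        if not nz or abs(w[nz[0]][c]) != 1:
            return None  # corner entry missing or not a unit in Z
        k = nz[0]
        w[c], w[k] = w[k], w[c]
        if w[c][c] < 0:
            w[c] = [-x for x in w[c]]
        for i in range(c):  # clear the entries above the corner
            q = w[i][c]
            w[i] = [x - q * y for x, y in zip(w[i], w[c])]
    return [row[n:] for row in w]


print(z_mat_inv([[1, 1, -3], [3, 2, -11], [-1, 1, 6]]))
print(z_mat_inv([[5, 4], [3, 3]]))
```

The second call returns None: the matrix with rows 5 4 and 3 3 row reduces over Z to a matrix with corner entries 1 and 3, and 3 is not a unit in Z.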

For example, suppose R = Z and

      ⎕←A←3 3⍴1 1 ¯3 3 2 ¯11 ¯1 1 6
 1 1  ¯3
 3 2 ¯11
¯1 1   6

Row reducing A using ZROWREDUCE,

      ZROWREDUCE A                    R+.×A
1 0 0                           1 0 0
0 1 0                           0 1 0
0 0 1                           0 0 1
      R                               A+.×R
23 ¯9 ¯5                        1 0 0
¯7  3  2                        0 1 0
 5 ¯2 ¯1                        0 0 1

we see that A is row equivalent to the identity, so the inverse of A is the
matrix R computed by ZROWREDUCE. This is the method used by
ZMATINV.

      ZMATINV A
23 ¯9 ¯5
¯7  3  2
 5 ¯2 ¯1

We close this section with a two-part problem that illustrates the kind 
of computations we can now carry out. Let 


      ⎕←A←3 4⍴¯1 3 ¯5 3 3 ¯11 16 ¯8 2 ¯4 9 ¯4
¯1   3 ¯5  3
 3 ¯11 16 ¯8
 2  ¯4  9 ¯4
      ⎕←B←3 4⍴¯3 ¯3 ¯9 ¯15 ¯11 ¯13 ¯32 ¯58 ¯7 ¯9 ¯20 ¯39
 ¯3  ¯3  ¯9 ¯15
¯11 ¯13 ¯32 ¯58
 ¯7  ¯9 ¯20 ¯39
      C←5 1 17 10


First, we want to decide whether or not C is in S_Z(A) and, if so, to find an
integer vector X such that C is X+.×A. Second, we would like to know
whether or not A and B are row equivalent over Z and, if so, to find an
element E of GL_3(Z) such that B is E+.×A. Row reducing A,

      ⎕←D←ZROWREDUCE A                R
1 1  3 1                        3 0 2
0 2 ¯1 2                        2 0 1
0 0  0 3                        5 1 1

we can now use the method described in Section 1 to see if C is in S_Z(D) =
S_Z(A) and, if so, to write C as Y+.×D. The value of Y[1] must be

      C[1]÷D[1;1],

or 5. Subtracting,

      ⎕←C1←C-5×D[1;]
0 ¯4 2 5

we see that Y[2] must be C1[2]÷D[2;2], or -2. Subtracting again,

      ⎕←C2←C1-¯2×D[2;]
0 0 0 9

we find that C2 is 3×D[3;] and so Y[3] is 3 and C is in S_Z(D).

      Y←5 ¯2 3
      Y+.×D
5 1 17 10

However, we were asked to write C in the form X+.×A. We know that
C is Y+.×D and D is R+.×A. Therefore C is Y+.×R+.×A, which is
(Y+.×R)+.×A. Thus we may take X to be Y+.×R.

      ⎕←X←Y+.×R                       X+.×A
26 3 11                         5 1 17 10
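
The successive divisions and subtractions used to find Y can be sketched in Python. The names coords_in_row_space and vec_mat are hypothetical, and the sketch assumes that D is an integer matrix already in row-echelon form; it does not perform the row reduction itself.

```python
def coords_in_row_space(d, c):
    """Return Y with Y*D == C, or None if C is not in S_Z(D).
    D is assumed to be an integer matrix in row-echelon form."""
    rest = list(c)
    y = []
    for row in d:
        # Column of the corner entry of this row, if the row is nonzero.
        j = next((k for k, x in enumerate(row) if x != 0), None)
        if j is None:
            y.append(0)  # a zero row contributes nothing
            continue
        q, rem = divmod(rest[j], row[j])
        if rem != 0:
            return None  # the corner entry does not divide this component
        y.append(q)
        rest = [x - q * t for x, t in zip(rest, row)]
    return y if not any(rest) else None


def vec_mat(v, m):
    """Row vector times matrix, the APL expression V+.×M."""
    return [sum(v[i] * m[i][j] for i in range(len(m))) for j in range(len(m[0]))]


D = [[1, 1, 3, 1], [0, 2, -1, 2], [0, 0, 0, 3]]
R = [[3, 0, 2], [2, 0, 1], [5, 1, 1]]
Y = coords_in_row_space(D, [5, 1, 17, 10])
print(Y, vec_mat(Y, R))
```

With the matrices D and R of the example, Y is the vector 5 -2 3 found above and vec_mat(Y, R) is the vector X.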


In order to decide whether A and B are row equivalent, we must row
reduce B. Before doing so, we save the matrix R obtained before.

      P←R
      ZROWREDUCE B                    R
1 1  3 1                         5 ¯4  4
0 2 ¯1 2                         5 ¯2  1
0 0  0 3                        ¯4  3 ¯3

Here we see that B is also row equivalent to D, so A and B are row
equivalent. Also, D is R+.×B, so B is Q+.×D, where Q is the inverse of R. Thus
B is Q+.×P+.×A or (Q+.×P)+.×A.

      ⎕←Q←ZMATINV R
 ¯3  0  ¯4
¯11 ¯1 ¯15
 ¯7 ¯1 ¯10
      ⎕←E←Q+.×P
 ¯29  ¯4 ¯10
¯110 ¯15 ¯38
 ¯73 ¯10 ¯25
      E+.×A
 ¯3  ¯3  ¯9 ¯15
¯11 ¯13 ¯32 ¯58
 ¯7  ¯9 ¯20 ¯39

We have found E such that B is E+.×A. Since both Q and P are units in
M_3(Z), it follows that E is also a unit.


EXERCISES 


1 Complete the proof of Theorem 2.
2 Prove Theorem 3.


3 For each of the following matrices A find a unit E in M_3(Z) such
that E+.×A is row reduced over Z. Do the computations by
hand and check your work using ZROWREDUCE.


(a) 5 2 1 
A=|]6 -3 O 
2 1. 3 
(b) 3 1 2 
A= {|-2 OO -1 3 
5 4 -3 -2 
(c) 15 
A= | 21 




4 For each of the following matrices A over Z_7, find an element
E of GL_3(Z_7) such that E+.×A is row reduced over Z_7. Do the
computations by hand and check your work using ZNROWREDUCE.


(a) 1 6 0 
A= |2 5 1 
3 2 = 5 
(b) 4 1 2 = «5 
A= |6 3 0 1 
2 5 3 4 


5 Let A be the following matrix over R = Z_3[X].

\[ A = \begin{bmatrix} 1 + X & -1 + X \\ 1 + X^2 & X^2 \end{bmatrix} \]

Find a unit E in M_2(R) such that E+.×A is row reduced over
R. Do the computations by hand and check your work using
ZNXROWREDUCE.


6 Let A be a square matrix over a Euclidean domain. Describe a
procedure for computing det A using row reduction. Use this
procedure to compute the determinant of the matrix

\[ \begin{bmatrix} 9 & 1 & 7 \\ 2 & 5 & 4 \\ 3 & 6 & 8 \end{bmatrix} \]

over Z_11.

7 Compute the inverse of

\[ A = \begin{bmatrix} 1 & 1 & -2 \\ 2 & 3 & -5 \\ 1 & 0 & -2 \end{bmatrix} \]

in M_3(Z) by row reducing A to the identity matrix I and applying
the same row operations to I. Check your work using ZMATINV.


8 Determine the rank of

\[ A = \begin{bmatrix} 2 & 4 & -6 \\ 18 & -9 & 6 \end{bmatrix} \]

modulo p for several small primes p. What is the set of primes
modulo which A has rank 1?




9 Let C←16 35 23 52 and


O<+A+3 492 9 384 16 ° 7°32 5 0 
3 


Find an integer vector X such that C is X+.×A.

10 Find a unit E in M_3(Z) such that B = E+.×A, where

\[
A = \begin{bmatrix} 3 & -1 & 5 & 4 \\ -2 & 6 & 1 & 0 \\ 4 & 2 & 3 & 2 \end{bmatrix},
\qquad
B = \begin{bmatrix} -3 & 3 & 3 & 2 \\ -7 & -29 & -18 & -10 \\ 5 & 21 & 14 & 8 \end{bmatrix}.
\]

*11 Find an appropriate definition of "row reduced over Z[i]" such
that every matrix over Z[i] is row equivalent to exactly one
row-reduced matrix.

12 Let A be a matrix with rational entries and set m = dim_Q(S_Q(A))
and n = dim_R(S_R(A)). Can it happen that m ≠ n?

13 Let A and B be row-equivalent matrices over R. Show that the
gcd of the entries of A is the same (up to associates) as the gcd
of the entries of B.

*14 Generalize Exercise 13 as follows. Let A be an m-by-n matrix
over R and let k be an integer between 1 and the smaller of m
and n. Set D_k(A) equal to the gcd of the determinants of the
k-by-k submatrices of A. Show that if B is row equivalent to A,
then D_k(B) = D_k(A).

15 Show that (x_1,...,x_n) in R^n is part of a basis for R^n if and only
if gcd(x_1,...,x_n) = 1. Hint. Row reduce the n-by-1 matrix

\[ \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}. \]

16 Find a unit in M_3(Z) with (6, 10, 15) as its first row.




3. VECTOR SPACES 


At this point we have already spent a considerable amount of time studying 
finitely generated free R-modules, their submodules, and their endomorph- 
isms. In Sections 5.2 and 5.3 we began with very few assumptions about the 
ring R and later proved some theorems that required R to be a PID. In the 
first two sections of this chapter we obtained further results assuming that 
R is a Euclidean domain. Now we will specialize even further and assume 
that R is a field. Before doing so, however, it will be convenient to list some 
important facts that we have already established. 


THEOREM 1. Let R be a nontrivial commutative ring and let n be a
positive integer. Then

(a) Every basis of R^n has n elements.

(b) Elements u_1,...,u_n of R^n form a basis for R^n if and only if
they generate R^n.

(c) If f in End_R(R^n) is surjective, then f is injective.

(d) If R is a PID, then every submodule of R^n is a free module of
rank at most n.


Proof. This theorem is a restatement of Corollaries 9, 13, 14, and 16
of Section 5.2. □


If R is a Euclidean domain, then the row-reduction procedures of the
previous two sections provide a means for answering inclusion and
membership questions about submodules of R^n from a knowledge of generating
sets for those submodules. That is, given x, y_1,...,y_r and z_1,...,z_s in
R^n, we can decide whether x is in M = <y_1,...,y_r> and whether M is
contained in <z_1,...,z_s>.

Throughout this section F will be a field and, unless otherwise
indicated, all vector spaces will be over F. By Theorem 5.2.5, we know that
vector spaces are free modules, so the conclusions of Theorem 1 apply to
any vector space of dimension n over F.


THEOREM 2. Let v_1,...,v_m be linearly independent elements of the
vector space V and suppose v_{m+1} is in V - <v_1,...,v_m>. Then v_1,...,
v_{m+1} are linearly independent.


Proof. Suppose c_1,...,c_{m+1} are elements of F such that c_1 v_1 +
... + c_{m+1} v_{m+1} = 0. If c_{m+1} ≠ 0, then

\[ v_{m+1} = -\frac{1}{c_{m+1}} (c_1 v_1 + \cdots + c_m v_m), \]

contradicting our assumption that v_{m+1} is not in <v_1,...,v_m>. Thus
c_{m+1} = 0 and c_1 v_1 + ... + c_m v_m = 0. By the linear independence of
v_1,...,v_m, this implies that c_1 = c_2 = ... = c_{m+1} = 0. □

COROLLARY 3. Let V be a vector space of dimension n and let
v_1,...,v_m be linearly independent elements of V. Then there exist
elements v_{m+1},...,v_n of V such that v_1,...,v_n is a basis for V.


Proof. Let W_1 = <v_1,...,v_m>. If W_1 = V, then v_1,...,v_m is
a basis for V and m = n. If W_1 ≠ V, choose v_{m+1} in V - W_1 and set W_2 =
<v_1,...,v_{m+1}>. If W_2 ≠ V, choose v_{m+2} in V - W_2. Continuing in this
way, we must reach a basis for V. Otherwise, after n - m + 1 steps, we will
have produced n + 1 linearly independent elements of V, which do not
exist by Theorem 1d. □


COROLLARY 4. If V is a finite dimensional vector space and W is
a subspace of V with dim_F W = dim_F V, then W = V.


Proof. Let v_1,...,v_n be a basis for W. Then n = dim_F V. If W ≠ V,
we can choose v_{n+1} in V - W and obtain n + 1 linearly independent
elements of V, again contradicting Theorem 1d. □


COROLLARY 5. If V is a finite dimensional vector space, then any 
spanning set for V contains a subset that is a basis for V. 


Proof. Let n = dim_F V and let X be a spanning set for V. Since no
linearly independent subset of V contains more than n elements, we can find
a linearly independent subset Y of X such that Y is not contained properly
in any other linearly independent subset of X. Set W = <Y>. If W ≠ V,
then X is not contained in W. Thus there exists an element x in X - W. By
Theorem 2, Y ∪ {x} is a linearly independent subset of X properly
containing Y. Thus W = V and Y is a basis for V. □


COROLLARY 6. Let V be a vector space of dimension n and let
v_1,...,v_n be elements of V. Then the following are equivalent.

(a) v_1,...,v_n span V.

(b) v_1,...,v_n are linearly independent.

(c) v_1,...,v_n form a basis for V.


Proof. By definition, (c) implies (a) and (b). By Theorem 1b, (a) implies
(c). Finally, if (b) holds, then Corollary 4 shows that <v_1,...,v_n> = V,
and so (b) implies (c). □


Theorem 2, from which Corollaries 3 to 6 follow, is not true in free
modules over more general commutative rings. For example, in Z^2 the
elements v_1 = (2, 0) and v_2 = (0, 2) are linearly independent and v_3 = (1, 0)
is not in <v_1, v_2>. However, v_1 - 2v_3 = 0, so v_1, v_2, v_3 are not linearly
independent over Z.


Suppose we have elements v_1,...,v_m in F^n that are linearly
independent. Corollary 3 states that we can extend this sequence to a basis of
F^n. How is this done in practice? For example, if

      ⎕←A←3 5⍴3 4 2 4 1 2 1 4 0 2 1 3 1 3 2
3 4 2 4 1
2 1 4 0 2
1 3 1 3 2

then modulo 5 the rows of A are linearly independent, as the following
computation shows.

      N←5
      ⎕←B←ZNROWREDUCE A
1 3 0 0 1
0 0 1 0 0
0 0 0 1 2

How can we find two more elements in (Z_5)^5 that, together with the rows
of A, form a basis for (Z_5)^5? If we let

      ⎕←C←(⍳5)∘.=⍳5                   C[1 3 4;]←B
1 0 0 0 0                             C
0 1 0 0 0                       1 3 0 0 1
0 0 1 0 0                       0 1 0 0 0
0 0 0 1 0                       0 0 1 0 0
0 0 0 0 1                       0 0 0 1 2
                                0 0 0 0 1

then C is in row-echelon form and the rows of C are a basis for (Z_5)^5. Since
A is row equivalent to B, it follows that C is row equivalent to D, where

      D←C
      D[1 3 4;]←A
      D
3 4 2 4 1
0 1 0 0 0
2 1 4 0 2
1 3 1 3 2
0 0 0 0 1

Thus the rows of D form a basis for (Z_5)^5 that includes the rows of A. This
method may be used over any field F and with any value of n. That is,
if A is an m-by-n matrix over F whose rows are linearly independent, we row
reduce A to a matrix B in row-echelon form and let J be the set of integers
j such that B[;j] contains a corner entry. Then the rows of A together with
the standard basis vectors u_j of F^n with j in {1,...,n} - J form a basis
for F^n.
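
The rule just stated can be sketched in Python. The function name extend_to_basis is hypothetical, and the sketch takes the row-echelon matrix B as given instead of computing it. Columns are indexed from 0 in the code, so the set J = {1, 3, 4} of the example appears as {0, 2, 3}.

```python
def extend_to_basis(a, b):
    """a: matrix with linearly independent rows over a field.
    b: a matrix in row-echelon form that is row equivalent to a.
    Returns the rows of a followed by the standard basis vectors u_j
    for the columns j that contain no corner entry of b."""
    n = len(b[0])
    corner_cols = set()
    for row in b:
        j = next((k for k, x in enumerate(row) if x != 0), None)
        if j is not None:
            corner_cols.add(j)
    extra = [[int(k == j) for k in range(n)]
             for j in range(n) if j not in corner_cols]
    return [list(r) for r in a] + extra


A = [[3, 4, 2, 4, 1], [2, 1, 4, 0, 2], [1, 3, 1, 3, 2]]
B = [[1, 3, 0, 0, 1], [0, 0, 1, 0, 0], [0, 0, 0, 1, 2]]
for row in extend_to_basis(A, B):
    print(row)
```

The output lists the rows of A first and the added vectors u_2 and u_5 last, so it agrees with the matrix D of the example up to the order of the rows.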

Corollary 5 suggests problems of the following type. Suppose A is an
m-by-n matrix with entries in F. How can we select a subset of the rows
of A that is a basis for S_F(A)? For example, let

      N←11
      ⎕←A←4 4⍴2 1 3 8 7 4 1 6 9 0 0 3 4 5 2 8
2 1 3 8
7 4 1 6
9 0 0 3
4 5 2 8

where A is considered to be an element of M_4(Z_11), and set U = S_{Z_11}(A).
The nonzero rows of

      ZNROWREDUCE A
1 0 0  4
0 1 0 10
0 0 1  4
0 0 0  0

form a basis for U. However, Corollary 5 says that we can also find a basis
that consists of rows of A. One method for finding such a basis is to row
reduce the first m rows for 2 ≤ m ≤ 4. We have already done the case m = 4.
With m = 2, 3, we get

      ZNROWREDUCE 2 4↑A
1 0 0 4
0 1 3 0
      ZNROWREDUCE 3 4↑A
1 0 0 4
0 1 3 0
0 0 0 0

Thus the first two rows of A are linearly independent and generate a
subspace that contains A[3;] but not A[4;]. Therefore the rows of
A[1 2 4;] form a basis for U. In the next section we will present an
alternative solution to this problem that requires only one row reduction.
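
The selection of rows can also be sketched in Python. The sketch below does not repeat a full row reduction for each m. Instead it keeps a reduced copy of every row chosen so far and tests each new row against them, which is a variation on the method described above and not the alternative promised for the next section. The name independent_rows_mod_p is hypothetical, and rows are indexed from 0.

```python
def independent_rows_mod_p(rows, p):
    """Return the indices (from 0) of rows that form a basis over Z_p
    (p prime) for the row space of the matrix.  Hypothetical helper."""
    basis = []   # pairs (corner column, reduced row) for the chosen rows
    chosen = []
    for idx, row in enumerate(rows):
        v = [x % p for x in row]
        # Subtract multiples of the reduced rows found so far.
        for col, b in basis:
            if v[col]:
                f = v[col]
                v = [(x - f * y) % p for x, y in zip(v, b)]
        col = next((k for k, x in enumerate(v) if x), None)
        if col is None:
            continue  # this row lies in the span of the rows already chosen
        inv = pow(v[col], -1, p)
        v = [(inv * x) % p for x in v]
        basis.append((col, v))
        chosen.append(idx)
    return chosen


A = [[2, 1, 3, 8], [7, 4, 1, 6], [9, 0, 0, 3], [4, 5, 2, 8]]
print(independent_rows_mod_p(A, 11))
```

For the matrix A of the example with p = 11 the indices returned are 0, 1, 3, which correspond to the rows A[1 2 4;].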

Theorem 2 and its corollaries deal with subspaces of a finite dimen- 
sional vector space V. Now we turn our attention to quotient spaces and 
homomorphic images of V. 


THEOREM 7. Let V be a finite dimensional vector space and let W be
a subspace of V. Then

\[ \dim_F W + \dim_F (V/W) = \dim_F V. \]

Proof. Let n = dim_F V and m = dim_F W and let v_1,...,v_m be a basis
for W. By Corollary 3, we can find v_{m+1},...,v_n in V such that v_1,...,v_n
is a basis for V. For any v in V let v̄ be the image of v under the natural map
from V to V/W. Thus v̄ = W + v. Since V is spanned by v_1,...,v_n, it
follows that V/W is spanned by v̄_1,...,v̄_n. But v̄_1 = ... = v̄_m = 0, so V/W
is spanned by v̄_{m+1},...,v̄_n. Suppose a_{m+1},...,a_n are elements of F such
that a_{m+1} v̄_{m+1} + ... + a_n v̄_n = 0. Let w = a_{m+1} v_{m+1} + ... + a_n v_n.
Then w̄ = 0, so w is in W. Thus we can find a_1,...,a_m in F such that
w = a_1 v_1 + ... + a_m v_m. But then

\[ a_1 v_1 + \cdots + a_m v_m - a_{m+1} v_{m+1} - \cdots - a_n v_n = 0. \]

Since the v_i are linearly independent, we must have a_1 = ... = a_m =
a_{m+1} = ... = a_n = 0. Therefore v̄_{m+1},...,v̄_n are linearly independent,
so v̄_{m+1},...,v̄_n is a basis for V/W. Hence dim_F (V/W) = n - m and

\[ \dim_F W + \dim_F (V/W) = m + (n - m) = n = \dim_F V. \]

□


COROLLARY 8. Let V and W be vector spaces over F with V finite
dimensional. If f: V → W is a linear transformation with kernel K and image
U, then

\[ \dim_F K + \dim_F U = \dim_F V. \]

Proof. By the First Isomorphism Theorem, U is isomorphic to V/K.
In particular, dim_F U = dim_F (V/K). The corollary now follows immediately
from Theorem 7. □


COROLLARY 9. If f is a linear transformation of the finite dimensional
vector space V into itself, then f is surjective if and only if f is injective.


Proof. Let K be the kernel of f and U the image of f. By Corollary
8, we know that dim_F K = 0 if and only if dim_F U = dim_F V. But this says
that K = {0} if and only if U = V. □


Corollary 9 should be considered analogous to the fact that
surjectivity and injectivity are equivalent for maps of a given finite set into itself.

COROLLARY 10. If U and V are finite dimensional subspaces of a
vector space, then

\[ \dim_F (U + V) + \dim_F (U \cap V) = \dim_F U + \dim_F V. \]

Proof. By the Second Isomorphism Theorem, (U + V)/V is isomorphic
to U/(U ∩ V). Therefore

\[ \dim_F [(U + V)/V] = \dim_F [U/(U \cap V)]. \]

But, by Theorem 7,

\[ \dim_F [(U + V)/V] = \dim_F (U + V) - \dim_F V, \]

\[ \dim_F [U/(U \cap V)] = \dim_F U - \dim_F (U \cap V). \]

The result now follows immediately. □


In Section 5.3 we studied homomorphisms between free modules.
Suppose U and V are vector spaces of dimension m and n, respectively,
over F. By Theorem 5.3.4, Hom_F(U, V) is isomorphic as a vector space to
the set of m-by-n matrices over F. In particular,

\[ \dim_F (\mathrm{Hom}_F(U, V)) = mn. \]

If u_1,...,u_m is a basis for U and v_1,...,v_n is a basis for V, then for
1 ≤ i ≤ m and 1 ≤ j ≤ n we can define f_ij in Hom_F(U, V) by

\[ u_i f_{ij} = v_j, \qquad u_k f_{ij} = 0, \quad k \ne i. \]

The elements f_ij form a basis for Hom_F(U, V).

Let V be any vector space over F. The vector space Hom_F(V, F) is
called the dual space and will be denoted V*. Elements of V* are
sometimes called linear functionals on V. Suppose V is finite dimensional. Then
the preceding remark implies that dim_F V* = dim_F V as dim_F F = 1.
Moreover, if v_1,...,v_n is a basis for V, then there is a basis f_1,...,f_n of V*
such that v_i f_i = 1 and v_i f_j = 0, i ≠ j. We call f_1,...,f_n the dual basis of
v_1,...,v_n. Even though V and V* are isomorphic, there is no
isomorphism of V onto V* that is natural in the sense of not requiring a choice of
basis to be made in V. We can also form V** = (V*)*. When V is finite
dimensional, V** is also isomorphic to V, and this time the isomorphism
is natural.


THEOREM 11. Let V be a finite dimensional vector space. For v in
V and f in V*, define T_v(f) to be vf. Then T_v is in V** and v ↦ T_v is an
isomorphism of V onto V**.


Proof. By definition, T_v(f) is in F, so T_v maps V* to F. If g is also in
V* and a, b are in F, then

\[ T_v(af + bg) = v(af + bg) = a(vf) + b(vg) = aT_v(f) + bT_v(g). \]

Therefore T_v is in Hom_F(V*, F) = V**. If w is another element of V,
then for all f in V* we have

\[ T_{v+w}(f) = (v + w)f = vf + wf = T_v(f) + T_w(f) = (T_v + T_w)(f). \]

Therefore T_{v+w} = T_v + T_w. Similarly, T_{av} = aT_v for every a in F.
Therefore v ↦ T_v is a linear transformation of V into V**. Suppose v is in V and
T_v = 0. Then vf = 0 for all f in V*. If v ≠ 0, we can find a basis v_1 = v,
v_2,...,v_n of V. Let f_1,...,f_n be the corresponding dual basis of V*.
Then vf_1 = 1, contradicting T_v = 0. Thus v = 0 and v ↦ T_v is injective.
Since dim_F V = dim_F V**, it follows (see Exercise 7) that this injection is
an isomorphism. □


The map T_v in Theorem 11 is called evaluation at v. Theorem 11 says
that each element of V** is evaluation at a unique v in V.


EXERCISES 
1 Give an example of a generating set for Z^2 that does not contain
a basis.
2 Show that Z^2 contains proper Z-submodules isomorphic to Z^2.
3 Let A be the matrix


3 1 5 2 6 
2 4 | 3.2 
5 0 6 6 44, 
0 2 0 1 3 
3 6 1 5 4 


considered over Z.. Find a set of rows of A that forms a basis for 
Sz (A). 


4 Show that the rows of the matrix 


7 l 3 ] 3 
are linearly independent over Z_11 and find two more vectors that,
together with the rows of A, form a basis for (Z_11)^5.


5 Let U and V be subspaces of a vector space of dimension 10. If
dim_F U = 6 and dim_F V = 7, what are the possible values for
dim_F (U ∩ V)?


6 Let U=S7,(A) and V=Sz,(B), where 




aN 
II 
| 
m= bo 
— fH oO f 
W W 
WwW 
Ld 


Compute dimz ,U,dimz,V,dimz,(U + V), and dimz (UN JV). 
7 Let U and V be vector spaces of the same finite dimension. Show
that f in Hom_F(U, V) is injective if and only if it is surjective.

8 Give an example of an element of End_Z(Z^2) that is injective but
not surjective.

9 How many bases does (Z_p)^n have, p a prime?

10 Let u = (1, 0) and v = (0, 2) in Z^2. Show that there is an element
of End_Z(Z^2) that maps u to v but no element of End_Z(Z^2) that
maps v to u.

11 Let v_1,...,v_n and w_1,...,w_n be bases of the vector space V
and let f_1,...,f_n and g_1,...,g_n be the corresponding dual
bases of V*. Suppose P is the transition matrix from the v's to
the w's. What is the transition matrix from the f's to the g's?

12 Let U and V be vector spaces and let T be in Hom_F(U, V). For
f in V* set fT* = T ∘ f. Show that T* is in Hom_F(V*, U*).
Suppose u_1,...,u_m is a basis for U and v_1,...,v_n is a basis for
V. Let g_1,...,g_m and f_1,...,f_n be the bases of U* and V*
dual to the u's and the v's, respectively. Suppose A is the matrix
for T with respect to the u's and the v's. What is the matrix for
T* with respect to the f's and the g's? (We call T* the
contragredient of T.)

13 Let W be a subspace of the vector space V over the field F. The
annihilator Ann(W) of W is the set of all f in V* such that wf =
0 for all w in W. Show that Ann(W) is a subspace of V*.

14 Let V and W be as in Exercise 13 and assume that V is finite
dimensional. Prove that the function that maps f in V* to the
restriction f|_W is a linear transformation of V* onto W* with kernel
Ann(W). Conclude that dim_F W + dim_F Ann(W) = dim_F V.

15 Let U and W be subspaces of the finite dimensional vector space
V. Show that

Ann(U + W) = Ann(U) ∩ Ann(W),

Ann(U ∩ W) = Ann(U) + Ann(W).




4. SOLVING LINEAR SYSTEMS USING ROW OPERATIONS 


Let R be a commutative ring. By a system of linear equations over R we
mean a system of simultaneous equations of the form

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1, \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2, \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m,
\end{aligned}
\tag{*}
\]

where the a_ij and the b_i are in R. A solution to the system (*) is a vector
X = (x_1,...,x_n) in R^n whose components satisfy the equations. If we let
A be the matrix of a_ij's and B the vector (b_1,...,b_m), we can write (*)
as A+.×X = B or, simply, AX = B. We call A the matrix of coefficients for
the system. If B = 0, we say that the system is homogeneous. The APL
proposition ∧/B=A+.×X corresponds to the assertion that X is a solution
of the linear system over R with A as the matrix of coefficients and B as the
vector of constant terms.

Linear systems need not have any solutions. For example, the system

\[ x + y = 0, \qquad x + y = 1 \]

has no solutions, provided R is nontrivial, and the system

\[ 2x + 6y = 5 \]

has no solutions in Z, even though it has infinitely many solutions in Q.
A linear system (*) that has no solutions in R^n is said to be inconsistent.
A homogeneous system AX = 0 is always consistent, since X = 0 is a solution.
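
For a single equation over Z, consistency can be checked with a standard fact that is not proved in this section: the equation a_1 x_1 + ... + a_n x_n = b has an integer solution if and only if gcd(a_1,...,a_n) divides b. A minimal Python sketch follows; the function name solvable_over_z is hypothetical.

```python
from functools import reduce
from math import gcd


def solvable_over_z(coeffs, b):
    """Decide whether coeffs[0]*x_1 + ... + coeffs[n-1]*x_n = b has a
    solution in integers, using the gcd criterion."""
    g = reduce(gcd, coeffs, 0)
    if g == 0:  # all coefficients are zero
        return b == 0
    return b % g == 0


print(solvable_over_z([2, 6], 5))   # the system 2x + 6y = 5
print(solvable_over_z([2, 6], 4))
```

The first call reports that 2x + 6y = 5 is inconsistent over Z, since gcd(2, 6) = 2 does not divide 5. Over Q every value of x determines y = (5 - 2x)/6, which gives the infinitely many rational solutions.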

Linear systems arise in many different contexts, and their solutions 
can be interpreted in many different ways. Before discussing techniques 
for solving linear systems, we will examine a few examples that illustrate 
the importance of such systems. 


Example 1. Suppose f is in Hom_R(R^n, R^m). By Theorem 5.3.4, there is
a unique n-by-m matrix A such that the image of X in R^n under f is X+.×A.
If we want to know the kernel of f, we are looking for vectors X in R^n such
that X+.×A = 0. Since the transpose of a vector is itself, we can write
this condition as A^t+.×X = 0. Thus, computing the kernel of f is equivalent
to solving the homogeneous linear system whose matrix of coefficients is
A^t. (The fact that the matrix is A^t and not A is the price we pay for writing
our homomorphisms on the right.)


Example 2. Let f and A be as in Example 1 and let B be an element of
R^m. A natural question to ask is whether or not B is in the image of R^n
under f. In other words, is there a vector X in R^n such that X+.×A = B
or, equivalently, A^t X = B? Thus B is in the image of R^n if and only if the
linear system A^t X = B is consistent.


Example 3. The homogeneous system AX = 0 can be written

\[ \sum_{j=1}^{n} x_j A[;j] = 0. \]

This system has a nontrivial solution if and only if the columns of A are
linearly dependent over R. Thus, in solving the system AX = 0, we are
answering the following question. Are the columns of A linearly independent
and, if not, what are all the ways the zero vector can be written as a linear
combination of the columns of A?

Example 4. Let A and B be the integer matrices 


2 -l 3 3 2 —7 


| 5 —2 4 -] 2 


and let U = S7(A) and V = Sz(B). Suppose we want to describe UN J. 
A typical element of U is 


(x,y)+.xA = (2x +y,—-x + 5y,3x —2y) 
and a typical element of V is 
(Z,w)+. xB = (3z + 4w, 2x — w, —72 + 2w). 


If C is an element of UNM V, then there must be integers x, y, z, w such 
that (x, v)+,.xA =C=(z, w)+, xB. We can write the vector equality 


(x, y)+.xA=(@,w)+,xB 
in the form 

2x + y—3z-—4w=Q0, 

—x +5y—2z+ w=QO, 


3x —2y + 7z—2w=0. 


SOLVING LINEAR SYSTEMS USING ROW OPERATIONS 313 


Each vector C in $U \cap V$ gives rise to one or more solutions to this system. Conversely, if (x, y, z, w) is a solution to this system, then C = (x, y)+.×A is in $U \cap V$. If A and B are any matrices over R with n columns, then finding $S_R(A) \cap S_R(B)$ is equivalent to solving the homogeneous system whose matrix of coefficients is $A^t, (-B^t)$, where we use the APL notation for catenation to describe the matrix with n rows whose columns consist of the rows of A followed by the negatives of the rows of B.


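Example 5. Interpolation of polynomials may be viewed as solving linear systems. Suppose we are given elements $a_0, \dots, a_m$ and $b_0, \dots, b_m$ of R and we want to find a polynomial $f = c_0 + c_1X + \dots + c_nX^n$ such that $f(a_i) = b_i$, $0 \le i \le m$. Then the vector $(c_0, \dots, c_n)$ is a solution of the system

$$\begin{aligned} c_0 + a_0c_1 + a_0^2c_2 + \dots + a_0^nc_n &= b_0,\\ c_0 + a_1c_1 + a_1^2c_2 + \dots + a_1^nc_n &= b_1,\\ &\;\;\vdots\\ c_0 + a_mc_1 + a_m^2c_2 + \dots + a_m^nc_n &= b_m. \end{aligned}$$

If m = n, then the matrix of this system,

$$\begin{bmatrix} 1 & a_0 & a_0^2 & \cdots & a_0^n \\ 1 & a_1 & a_1^2 & \cdots & a_1^n \\ \vdots & & & & \vdots \\ 1 & a_n & a_n^2 & \cdots & a_n^n \end{bmatrix},$$

is called a Vandermonde matrix. [Alexandre T. Vandermonde (1735-1796) was a French mathematician.]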
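The interpolation problem of Example 5 can be tried out numerically. The following is a minimal sketch in Python rather than APL, assuming R = Q and distinct points; the helper names `vandermonde` and `solve_square` and the sample data are ours, not the book's.

```python
from fractions import Fraction

def vandermonde(points, n):
    """Rows (1, a, a^2, ..., a^n), one row per interpolation point a."""
    return [[Fraction(a) ** j for j in range(n + 1)] for a in points]

def solve_square(A, b):
    """Solve A c = b over Q by Gauss-Jordan elimination.
    Assumes A is square and invertible (true here when the points are distinct)."""
    n = len(A)
    M = [row[:] + [Fraction(v)] for row, v in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        inv = 1 / M[col][col]
        M[col] = [v * inv for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

# Hypothetical data: find f of degree at most 2 with f(0) = 1, f(1) = 3, f(2) = 11.
points = [0, 1, 2]
values = [1, 3, 11]
coeffs = solve_square(vandermonde(points, 2), values)
print(coeffs)  # coefficients c_0, c_1, c_2 of f
```

For these sample values the solution vector is (1, −1, 3), that is, f = 1 − X + 3X².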

The general form of the set of all solutions to a linear system is not 
hard to describe. 


THEOREM 1. Let A be an m-by-n matrix over a commutative ring R and let B be a vector of length m over R. The set U of all solutions of the system AX = 0 is a submodule of $R^n$. If the system AX = B is consistent, then the set of all solutions of AX = B is a coset of U in $R^n$.


Proof. Let $T: R^n \to R^m$ be the R-homomorphism taking X to X+.×$A^t$. Then X is in U if and only if XT = 0. Therefore U is the kernel of T and so is a submodule of $R^n$. The system AX = B is consistent if and only if B is in the image of T. If AX = B is consistent, then the set of solutions is $\{B\}T^{-1}$, the inverse image of {B} under T. If $\{B\}T^{-1}$ is nonempty, then, by Theorem 3.5.3, $\{B\}T^{-1}$ is the coset $U + X_0$, where $X_0$ is any element of $\{B\}T^{-1}$. □


If in Theorem 1 the ring R is a PID, then U is a free module and can be described by giving a basis $u_1, \dots, u_s$. Once we have found one solution $X_0$ of AX = B, then any solution X of this system has the form $X = X_0 + c_1u_1 + \dots + c_su_s$, where the $c_i$ are unique elements of R.

We can now add to our list of properties that characterize invertible 
matrices over a field. 


THEOREM 2. Let F be a field and let A be in $M_n(F)$. The following are equivalent.

(a) det A ≠ 0.
(b) The system AX = B is consistent for every B in $F^n$.
(c) The only solution of the system AX = 0 is X = 0.


Proof. Let $T: F^n \to F^n$ be the linear transformation taking X to $AX = XA^t$. Condition (b) asserts that T is surjective and condition (c) states that T is injective. These two conditions are equivalent by Corollary 6.3.9. Now suppose det A ≠ 0. Then $X = A^{-1}B$ is a solution of AX = B, and (b) holds. If det A = 0, then $\det A^t = 0$ and, by Theorem 5.2.12, the rows of $A^t$ do not form a basis of $F^n$. But then Corollary 6.3.6 shows that the rows of $A^t$ are linearly dependent. Thus there exists a nonzero vector X in $F^n$ such that $XA^t = AX = 0$. □


COROLLARY 3. Let $a_0, \dots, a_n$ be elements of a field F and let A be the Vandermonde matrix

$$A = \begin{bmatrix} 1 & a_0 & a_0^2 & \cdots & a_0^n \\ 1 & a_1 & a_1^2 & \cdots & a_1^n \\ \vdots & & & & \vdots \\ 1 & a_n & a_n^2 & \cdots & a_n^n \end{bmatrix}.$$

If $a_0, \dots, a_n$ are distinct, then det A ≠ 0.


Proof. Let $B = (b_0, b_1, \dots, b_n)$ be in $F^{n+1}$. Then, by the discussion in Example 5, solving the system AX = B is equivalent to finding a polynomial f in F[X] of degree at most n such that $f(a_i) = b_i$, $0 \le i \le n$. By the Lagrange Interpolation Theorem (Theorem 4.12.1), such a polynomial exists if the $a_i$ are distinct. In this case the system AX = B is consistent for all B and, by Theorem 2, det A ≠ 0. □


Given a linear system AX = B, it would be useful to have a method for 
obtaining another system A’ Y = B’ with the following properties: 


1. The solutions of A’ Y = B’ are easy to describe. 


2. The solutions of AX = B can be constructed in a simple manner 
from the solutions of A'Y = B’. 


In this section we will discuss one method for getting such a system A'Y = 
B’. This method uses row reduction and is quite satisfactory when R is a 
field. However, when R is not a field, a more general approach must be used. 
This second approach is described in Section 7. 

The following theorem provides the foundation for our first method 
of solving linear systems. 


THEOREM 4. Let A be an m-by-n matrix over R, let B be in $R^m$, and let P be in $GL_m(R)$. Set A′ = PA and B′ = PB. Then the systems AX = B and A′X = B′ have the same solutions.


Proof. Suppose X in $R^n$ satisfies AX = B. Then A′X = (PA)X = P(AX) = PB = B′. Similarly, if X satisfies A′X = B′, then $AX = (P^{-1}A')X = P^{-1}(A'X) = P^{-1}B' = B$. □

Let A and B be as in Theorem 4. In APL notation the matrix A, B is m-by-(n + 1) and is formed from A by adding B as the (n + 1)th column. We call A, B the augmented matrix of the system AX = B.


COROLLARY 5. If the augmented matrices for two systems AX = 
B and A’X = B’' are row equivalent, then the systems have the same solu- 
tions. 


Proof. If A, B is row equivalent to A′, B′, then there is an invertible matrix P such that A′, B′ = P(A, B). But P(A, B) = (PA), PB, so A′ = PA and B′ = PB. By Theorem 4, the systems AX = B and A′X = B′ have the same solutions. □


Let us illustrate Corollary 5 with an example. Suppose we wish to solve

3x + 2y + 8z = 9,
4x + 6y + 3z = 1,
7x + 4y + 7z = 10,

considered as a system over $\mathbf{Z}_{11}$. If we row reduce the augmented matrix for this system,

      N←11
      ⎕←A←3 4⍴3 2 8 9 4 6 3 1 7 4 7 10
3 2 8  9
4 6 3  1
7 4 7 10
      ZNROWREDUCE A
1 0 2 3
0 1 1 0
0 0 0 0

we see that our system has the same solutions as the system

x + 2z = 3,
y + z = 0,
0 = 0.

The solutions are now obvious. We may choose z arbitrarily and then set x = 3 − 2z and y = −z.
If we now consider the system

8x + 3y + 8z = 8,
7x + 5y + 10z = 1,
3x + 3y + 10z = 7,

over $\mathbf{Z}_{11}$, then the same procedure

      ⎕←A←3 4⍴8 3 8 8 7 5 10 1 3 3 10 7
8 3  8 8
7 5 10 1
3 3 10 7
      ZNROWREDUCE A
1 0 4 0
0 1 3 0
0 0 0 1

shows that the system has the same solutions as

x + 4z = 0,
y + 3z = 0,
0 = 1.

This system is clearly inconsistent and thus so is the original system.

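The two reductions over $\mathbf{Z}_{11}$ can be reproduced without the workspace. The sketch below is a Python stand-in for the idea behind ZNROWREDUCE, not the book's procedure itself; the function name `row_reduce_mod` is ours, and the modulus is assumed prime so that every nonzero entry has an inverse.

```python
def row_reduce_mod(M, p):
    """Row reduce M over Z_p (p assumed prime).
    Returns the reduced matrix and the 0-based columns of its corner entries."""
    M = [[v % p for v in row] for row in M]
    rows, cols = len(M), len(M[0])
    corners, r = [], 0
    for c in range(cols):
        # Look for a nonzero entry in column c at or below row r.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        inv = pow(M[r][c], -1, p)          # inverse in Z_p (Python 3.8+)
        M[r] = [(v * inv) % p for v in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [(x - f * y) % p for x, y in zip(M[i], M[r])]
        corners.append(c)
        r += 1
        if r == rows:
            break
    return M, corners

first, first_corners = row_reduce_mod([[3, 2, 8, 9], [4, 6, 3, 1], [7, 4, 7, 10]], 11)
second, second_corners = row_reduce_mod([[8, 3, 8, 8], [7, 5, 10, 1], [3, 3, 10, 7]], 11)

# A system is inconsistent exactly when a corner entry lands in the last column.
print(first, first_corners)
print(second, second_corners)
```

In the second system the last corner entry falls in the column of constants, which is the row 0 = 1 seen above.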
We can now describe how to solve any linear system over a field. 


THEOREM 6. Let AX = B be a linear system over a field F with m equations in n variables. Assume further that A, B is row reduced over F and that the corner entries of A, B occur in columns $j_1 < j_2 < \dots < j_r$. Set $K = \{1, \dots, n\} - \{j_1, \dots, j_r\}$. The system AX = B is consistent if and only if $j_r \le n$. If this is the case, then one solution is given by $C = (c_1, \dots, c_n)$, where $c_{j_i} = b_i$, $1 \le i \le r$, and $c_k = 0$ for k in K. If k is in K, then let $U_k$ be the vector $(u_{k1}, \dots, u_{kn})$, where $u_{kk} = 1$, $u_{k\ell} = 0$ for $\ell$ in $K - \{k\}$, and $u_{kj_i} = -a_{ik}$, $1 \le i \le r$. The set $\{U_k \mid k \in K\}$ is a basis for the set of solutions of AX = 0.

Proof. If $j_r = n + 1$, then the rth equation in the system AX = B is $0x_1 + \dots + 0x_n = 1$, which certainly has no solutions in $F^n$. However, if $j_r \le n$, then the last m − r equations are of the form 0 = 0 and so may be ignored. For $1 \le i \le r$ the variable $x_{j_i}$ occurs with a nonzero coefficient only in the ith equation, which we may write in the form

$$x_{j_i} = b_i - \sum_{k \in K} a_{ik}x_k. \qquad (**)$$

Thus we may choose arbitrary values for the variables $x_k$ with k in K and use (**) to solve for the $x_{j_i}$. Taking $x_k = 0$ for all k in K gives the solution C. In solving the homogeneous system AX = 0 we may set $b_i = 0$ in (**). Now the vector $U_k$ is the solution corresponding to the choice of $x_k = 1$ and $x_\ell = 0$ for $\ell$ in $K - \{k\}$. The form of (**) implies that the only solution of AX = 0 with $x_k = 0$ for all k in K is the trivial solution X = 0. Suppose for each k in K we have an element $c_k$ in F such that

$$\sum_{k \in K} c_kU_k = 0.$$

For $\ell$ in K the $\ell$th component of the sum on the left is

$$\sum_{k \in K} c_ku_{k\ell} = c_\ell.$$

Thus $c_\ell = 0$ for all $\ell$ in K, and the $U_k$ are linearly independent. Finally, suppose $Y = (y_1, \dots, y_n)$ is any solution of AX = 0. We will show that Y = Z, where

$$Z = \sum_{k \in K} y_kU_k.$$

If $\ell$ is in K, then the $\ell$th component of Z is $y_\ell$. Thus Y − Z is a solution of AX = 0 whose $\ell$th component is 0 for all $\ell$ in K. By our previous remark, this means that Y − Z = 0 or Y = Z. Therefore the $U_k$ form a basis for the set of solutions of AX = 0. □


Let us illustrate Theorem 6 with an example over Q. The augmented matrix for the system

x − 2y + 3w = 2,
z − 5w = 1,

is

$$\begin{bmatrix} 1 & -2 & 0 & 3 & 2 \\ 0 & 0 & 1 & -5 & 1 \end{bmatrix},$$

which is row reduced over Q. In the notation of Theorem 6 we have $j_1 = 1$, $j_2 = 3$, and K = {2, 4}. Thus C = (2, 0, 1, 0) is a solution of the system, and the vectors $U_2 = (2, 1, 0, 0)$ and $U_4 = (-3, 0, 5, 1)$ form a basis for the vector space of solutions to the corresponding homogeneous system

x − 2y + 3w = 0,
z − 5w = 0.

Let us now use the method of Theorem 6 to solve the system

$$\begin{aligned} 2x_1 + 6x_2 + x_3 + 4x_5 &= 1,\\ 3x_1 + 2x_2 + 2x_3 + x_4 + 6x_5 + x_6 &= 5,\\ 4x_1 + 5x_2 + 3x_4 + 5x_5 &= 6,\\ x_1 + 3x_2 + 2x_3 + 3x_4 + 3x_5 + 4x_6 &= 5, \end{aligned}$$

considered as a system over $\mathbf{Z}_7$. Entering the coefficients and the vector of constant terms and row reducing the augmented matrix, we get

      N←7
      ⎕←A←4 6⍴2 6 1 0 4 0 3 2 2 1 6 1 4 5 0 3 5 0 1 3 2 3 3 4
2 6 1 0 4 0
3 2 2 1 6 1
4 5 0 3 5 0
1 3 2 3 3 4
      ⎕←B←1 5 6 5
1 5 6 5
      ⎕←A1←ZNROWREDUCE A,B
1 3 0 6 0 4 2
0 0 1 2 0 2 0
0 0 0 0 1 1 1
0 0 0 0 0 0 0

The last corner entry of A1 is not in the last column, so the system is consistent. A solution is given by C, constructed as follows.

      C←6⍴0
      C[1 3 5]←3↑A1[;7]
      C
2 0 0 0 1 0
      7|A+.×C
1 5 6 5

We can now construct a matrix U whose rows form a basis for the solutions of the corresponding homogeneous system.

      U←3 6⍴0
      U[;2 4 6]←(⍳3)∘.=⍳3
      U[;1 3 5]←7|-⍉A1[1 2 3;2 4 6]
      U
4 1 0 0 0 0
1 0 5 1 0 0
3 0 5 0 6 1

We leave it as an exercise to verify that this construction is the same as that described in Theorem 6.

We can now, as promised, provide a better algorithm for solving the following problem. Let A be a matrix over the field F. Choose a set of rows of A that is a basis for $S_F(A)$.


THEOREM 7. Let A be an m-by-n matrix over the field F and let B be the row-reduced matrix that is row equivalent to $A^t$. Then the row A[i;] is a linear combination of A[1;], ..., A[i−1;] if and only if B[;i] does not contain a corner entry. Suppose the corner entries of B occur in columns $j_1 < \dots < j_r$. Then $A[j_1;], \dots, A[j_r;]$ form a basis for $S_F(A)$.


Proof. In Example 3 we noted that $x_1A[1;] + \dots + x_mA[m;] = 0$ if and only if $X = (x_1, \dots, x_m)$ satisfies $A^tX = 0$. Now A[i;] is a linear combination of A[1;], ..., A[i−1;] if and only if we can find such an X with $x_i = 1$ and $x_k = 0$ for k > i. The system $A^tX = 0$ has the same solutions as BX = 0. If B[;i] does not contain a corner entry, we may set $x_i = 1$ and $x_j = 0$ for all other j such that B[;j] does not contain a corner entry and obtain a solution X of BX = 0. If $j_\ell > i$, then $x_{j_\ell} = -B[\ell;i] = 0$, so $x_k = 0$ for all k > i. Therefore A[i;] is a linear combination of A[1;], ..., A[i−1;].

Now suppose B[;i] does contain a corner entry B[j;i] = 1. If $X = (x_1, \dots, x_m)$ satisfies BX = 0 and $x_k = 0$ for all k > i, then

$$0 = \sum_{k} B[j;k]\,x_k = x_i,$$

since $x_k = 0$ for k > i and B[j;k] = 0 for k < i. Thus A[i;] is not a linear combination of A[1;], ..., A[i−1;]. Therefore $A[j_1;], \dots, A[j_r;]$ span $S_F(A)$, and each of these rows is not in the subspace spanned by the preceding rows. Hence $A[j_1;], \dots, A[j_r;]$ are linearly independent and form a basis for $S_F(A)$. □


In Section 3 we determined that the first, second, and fourth rows of the matrix

      ⎕←A←4 4⍴2 1 3 8 7 4 1 6 9 0 0 3 4 5 2 8
2 1 3 8
7 4 1 6
9 0 0 3
4 5 2 8

over $\mathbf{Z}_{11}$ form a basis for $S_{\mathbf{Z}_{11}}(A)$. Using Theorem 7, we can obtain this same result as follows.

      N←11
      ⎕←B←ZNROWREDUCE ⍉A
1 0 3 0
0 1 2 0
0 0 0 1
0 0 0 0

The corner entries of B occur in columns 1, 2, and 4.
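The row selection of Theorem 7 amounts to row reducing the transpose and recording where the corner entries fall. A minimal Python sketch, assuming a prime modulus and using our own function names (`corner_columns_mod`, `basis_rows`), applied to the matrix A above:

```python
def corner_columns_mod(M, p):
    """0-based columns holding corner entries after row reducing M over Z_p (p prime)."""
    M = [[v % p for v in row] for row in M]
    corners, r = [], 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c]), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        inv = pow(M[r][c], -1, p)
        M[r] = [(v * inv) % p for v in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(x - f * y) % p for x, y in zip(M[i], M[r])]
        corners.append(c)
        r += 1
        if r == len(M):
            break
    return corners

def basis_rows(A, p):
    """1-based indices of rows of A that form a basis of the row space (Theorem 7)."""
    transpose = [list(col) for col in zip(*A)]
    return [c + 1 for c in corner_columns_mod(transpose, p)]

A = [[2, 1, 3, 8],
     [7, 4, 1, 6],
     [9, 0, 0, 3],
     [4, 5, 2, 8]]
rows = basis_rows(A, 11)
print(rows)
```

Column 3 of the reduced transpose, (3, 2, 0, 0), also records the dependence A[3;] = 3A[1;] + 2A[2;] over $\mathbf{Z}_{11}$.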


EXERCISES 
1 Show that the Vandermonde determinant

$$\begin{vmatrix} 1 & x & x^2 \\ 1 & y & y^2 \\ 1 & z & z^2 \end{vmatrix}$$

is equal to (y − x)(z − x)(z − y).

2 Use the method of Theorem 6 to determine which of the follow- 
ing systems are consistent over Z,. For each consistent system find 
one solution and a basis for the solution space of the corresponding 
homogeneous system. 


(a) $2x_1 + 3x_2 = 4$,
    $4x_1 + x_2 = 2$.
(b) $2x_1 + 3x_2 = 4$,
    $4x_1 + x_2 = 3$.
(c) $2x_1 + x_2 + 2x_3 = 4$,
    $3x_1 + 2x_2 + 2x_3 = 2$,
    $4x_1 + 3x_2 + 4x_3 = 1$.
(d) $3x_1 + 4x_2 + x_3 + 4x_4 = 4$,
    $2x_1 + x_2 + 2x_3 + 4x_4 = 0$,
    $4x_1 + 2x_2 + 3x_3 + 2x_4 = 0$.

3 For the following system over Q, use the method of Theorem 6 to find one solution and a basis for the solution space of the corresponding homogeneous equation.


3x, + Xx, 43x; = 0, 


2X 4 + 2X 4 + 3x3 = —4/3, 
— 6x1 + 3x» =_—5, 
4 Let A be a matrix that is row reduced over R. Assume that the vector J lists the columns containing the corner entries of A. Using one or more APL statements, describe the construction of a matrix U whose rows form a basis for the solution space of the system ∧/0=A+.×X over R.


5 Let $F = \mathbf{Z}_{11}$ and let g be the element of $\mathrm{Hom}_F(F^5, F^6)$ whose matrix with respect to the standard bases is


3 = 10 l 9 4 5 
5 2 4 8 2 8 
8 l 5 6 | 9 
2 3 7 9 3 610 
l 7 6 8 10 9 


Find a basis for the kernel of g.

6 The coefficient matrix for the integral system

$$\begin{aligned} 2x_1 + 2x_2 + x_3 + x_4 &= 0,\\ 3x_2 + x_3 - x_4 &= 0 \end{aligned}$$

is row reduced over Z. Show that any solution of this system with $x_4 = 0$ is an integral multiple of U = (−1, −2, 6, 0) and any solution with $x_3 = 0$ is an integral multiple of V = (−5, 2, 0, 6). Show that the submodule of $\mathbf{Z}^4$ consisting of the solutions of this system contains ⟨U, V⟩ properly. Conclude that the method of Theorem 6 cannot be applied conveniently to systems over Z.


7 Let A be in $M_n(R)$, where R is commutative, and assume det A is a unit in R. Prove that any system AX = B has a unique solution $X = (x_1, \dots, x_n)$, where $x_i = (\det A^{(i)})/(\det A)$ and $A^{(i)}$ is obtained from A by replacing A[;i] by B. [This result is known as Cramer's Rule. Gabriel Cramer (1704-1752) was a Swiss mathematician.]


8 Suppose we are given the structure constants $e_{ijk}$, $1 \le i, j, k \le n$, for an algebra A over a commutative ring R with respect to a basis $x_1, \dots, x_n$ of A. Let $1 = u_1x_1 + \dots + u_nx_n$ in A, where the $u_i$ are in R. Show that we can find the $u_i$ by solving a linear system over R. Suppose $v_1, \dots, v_n$ are given elements of R. Prove that we can find the inverse of $x = v_1x_1 + \dots + v_nx_n$ in A by solving a linear system over R, assuming x has an inverse.


9 Let A and B be matrices over a commutative ring R whose rows form bases for two submodules U and V, respectively, of $R^n$. Show that $U \cap V$ is isomorphic to the module of solutions to the system CX = 0, where $C = A^t, (-B^t)$ is formed by catenating the transposes of A and −B. Describe the isomorphism explicitly. Use
this method to find a basis for Sr(C, ) NO Sr(C, ), where F = Z, and 


o. [2 1 4 «0 4 2 1 3 
; 3 2 3 2/1, G=A}2 3 2 1 
1 2 1 2 1 3 1 2 


10 Let F be a field and let V be a subspace of $F^n$. Show that there is a homogeneous linear system AX = 0 over F whose solution space is V. If F = Q, prove that A can be chosen to have integral entries.


11 Let $F = \mathbf{Z}_{13}$. Find a linear system over F whose solution space is $S_F(A)$, where


2 ] 9 7 4 
A= | 8 3 10 5 12 
7 6 2 1] 7 | 
12 Let F be a field and let $U_1$ and $U_2$ be subspaces of $F^n$. Suppose $U_1$ and $U_2$ are the solution spaces of the systems $A_1X = 0$ and $A_2X = 0$, respectively. Describe a linear system whose solution space is $U_1 \cap U_2$. Use this method to find a basis for $S_F(C_1) \cap S_F(C_2)$, where F, $C_1$, and $C_2$ are as in Exercise 9.

13 Let F be a field and let AX = B and A′X = B′ be two linear systems over F, each with m equations in m unknowns. Suppose both systems are consistent and have the same solutions. Prove that the augmented matrices A, B and A′, B′ are row equivalent.


5. FINITELY GENERATED MODULES 


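In this section we will be interested in the following questions. Suppose we are given two finitely generated R-modules $M_1$ and $M_2$. How can we decide if $M_1$ and $M_2$ are isomorphic? For example, the seven groups

$$\begin{array}{ll} \mathbf{Z}_{24} & \mathbf{Z}_2 \oplus \mathbf{Z}_3 \oplus \mathbf{Z}_4 \\ \mathbf{Z}_2 \oplus \mathbf{Z}_{12} & \mathbf{Z}_2 \oplus \mathbf{Z}_2 \oplus \mathbf{Z}_6 \\ \mathbf{Z}_3 \oplus \mathbf{Z}_8 & \mathbf{Z}_2 \oplus \mathbf{Z}_2 \oplus \mathbf{Z}_2 \oplus \mathbf{Z}_3 \\ \mathbf{Z}_4 \oplus \mathbf{Z}_6 & \end{array}$$

are all Z-modules of order 24. What are the isomorphisms, if any, among these groups? By Corollary 3.6.2 we know that $\mathbf{Z}_m \oplus \mathbf{Z}_n \cong \mathbf{Z}_{mn}$ whenever gcd(m, n) = 1. Therefore

$$\begin{aligned} &\mathbf{Z}_3 \oplus \mathbf{Z}_8 \cong \mathbf{Z}_{24},\\ &\mathbf{Z}_2 \oplus \mathbf{Z}_3 \oplus \mathbf{Z}_4 \cong \mathbf{Z}_4 \oplus \mathbf{Z}_6 \cong \mathbf{Z}_2 \oplus \mathbf{Z}_{12},\\ &\mathbf{Z}_2 \oplus \mathbf{Z}_2 \oplus \mathbf{Z}_2 \oplus \mathbf{Z}_3 \cong \mathbf{Z}_2 \oplus \mathbf{Z}_2 \oplus \mathbf{Z}_6. \end{aligned}$$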

It is a fact that there are no additional isomorphisms among these groups. 

One could also ask if any abelian group of order 24 must be isomorphic 
to one of the listed groups. As a corollary of the main theorem of this sec- 
tion, we will show that any finite abelian group is isomorphic to a direct 
sum of cyclic groups, and in Section 6 we will prove that any two such 
direct sums are isomorphic if and only if they can be shown to be isomor- 
phic by repeated use of Corollary 3.6.2. 

Before we can discuss algorithms for determining the isomorphism of two given modules, we must clarify what it means to be "given" a module. Let M be an R-module generated by elements $u_1, \dots, u_n$. The map $f: R^n \to M$ taking $(a_1, \dots, a_n)$ to $a_1u_1 + \dots + a_nu_n$ is a surjective R-homomorphism. Therefore M is isomorphic to $R^n/K$, where K is the kernel of f. If R is a PID, then K is finitely generated and there is a matrix A over R with n columns such that $K = S_R(A)$. We will normally consider a module M to be given if we know a matrix A such that $M \cong R^n/S_R(A)$, where n is the number of columns of A. Such a description of M is called a finite presentation.

Let us consider some examples with $R = \mathbf{Z}$. Since $\mathbf{Z}_n$ is $\mathbf{Z}/n\mathbf{Z}$, a finite presentation of $\mathbf{Z}_n$ is given by the 1-by-1 matrix whose entry is n. What is a finite presentation for $G = \mathbf{Z}_4 \oplus \mathbf{Z}_6$? The elements of G are pairs

$([a]_4, [b]_6)$,

where $[m]_n$ denotes the congruence class $m + n\mathbf{Z}$. The map

$(a, b) \mapsto ([a]_4, [b]_6)$

is a homomorphism of $\mathbf{Z}^2$ onto G. The kernel consists of all (a, b) with $a \equiv 0 \pmod 4$ and $b \equiv 0 \pmod 6$. Thus $G \cong \mathbf{Z}^2/S_{\mathbf{Z}}(A)$, where


2 0 0 
D= | 0 3 0 
0 0 4 


If we allow R to be any ring, then there is no known algorithm for deciding whether $R^m/S_R(A)$ is isomorphic to $R^n/S_R(B)$. However, when R is a Euclidean domain, such an algorithm exists. This algorithm is one of the most important algorithms in algebra.

Let A and B be two m-by-n matrices with entries in R. We say that A and B are equivalent over R if B can be obtained from A by a sequence of row and column operations over R. For example, if

      ⎕←A←B←2 2⍴5 ¯1 2 4
5 ¯1
2  4
      B[;1]←B[;1]+5×B[;2]
      B
 0 ¯1
22  4
      B[2;]←B[2;]+4×B[1;]
      B
 0 ¯1
22  0
      ⎕←B←⌽B
¯1  0
 0 22

then the final matrix B is obtained from A by an integer column operation, an integer row operation, and another integer column operation. Thus A and B are equivalent over Z. Clearly, equivalence of matrices is an equivalence relation.


THEOREM 1. Let A and B be equivalent m-by-n matrices over R. Then B = PAQ, where P and Q are units in $M_m(R)$ and $M_n(R)$, respectively.


Proof. Row operations can be performed by multiplying on the left with elementary matrices, and column operations can be performed by multiplying on the right with elementary matrices. Therefore

$$B = P_s \cdots P_2P_1AQ_1Q_2 \cdots Q_t,$$

where the $P_i$ and $Q_j$ are elementary matrices. If $P = P_s \cdots P_2P_1$ and $Q = Q_1Q_2 \cdots Q_t$, then B = PAQ and P and Q are units in $M_m(R)$ and $M_n(R)$, respectively. □


The matrix P in Theorem 1 can be obtained by applying the row operations corresponding to the elementary matrices $P_1, P_2, \dots, P_s$ to the identity matrix. Similarly, Q can be obtained by applying the column operations corresponding to the $Q_j$ to the identity matrix. In our preceding example we can compute P and Q as follows.

      P←Q←(⍳2)∘.=⍳2
      P[2;]←P[2;]+4×P[1;]
      P
1 0
4 1
      Q[;1]←Q[;1]+5×Q[;2]
      ⎕←Q←⌽Q
0 1
1 5
      P+.×A+.×Q
¯1  0
 0 22


The next result provides a basis for showing that two quotient modules 
of R” are isomorphic. 


THEOREM 2. Let N be a submodule of an R-module M and let f be an R-automorphism of M. Then Nf is a submodule of M and M/(Nf) is isomorphic to M/N.


Proof. By Theorem 5.1.5, we know that Nf is a submodule of M. Let $g: M \to M/N$ be the natural map and set $h = f^{-1} \circ g$. Then, since both $f^{-1}$ and g are surjective, h maps M onto M/N. If u is in Nf, then u = vf for some v in N. Thus $uh = v(f \circ f^{-1} \circ g) = vg = 0$. Therefore Nf is contained in the kernel of h. Conversely, if u is in the kernel of h, then $(uf^{-1})g = 0$ and $uf^{-1}$ is in the kernel of g, which is N. Hence $u = (uf^{-1})f$ is in Nf and the kernel of h is Nf. By the First Isomorphism Theorem, M/(Nf) is isomorphic to M/N. □


COROLLARY 3. Let A be an m-by-n matrix over R and let P and Q be units in $M_m(R)$ and $M_n(R)$, respectively. Then $R^n/S_R(A)$ and $R^n/S_R(PAQ)$ are isomorphic.


Proof. By Theorem 5.2.6, we have $S_R(PAQ) = S_R(AQ)$. Now the map $f: X \mapsto XQ$ of $R^n$ into itself is an endomorphism of $R^n$ and, by the isomorphism of Corollary 5.3.7, f is invertible and so f is an automorphism of $R^n$. The rows of AQ are the images under f of the rows of A and hence $S_R(AQ) = S_R(A)f$. The isomorphism of $R^n/S_R(AQ)$ with $R^n/S_R(A)$ now follows from Theorem 2. □

COROLLARY 4. Let A and B be equivalent m-by-n matrices over R. Then $R^n/S_R(A)$ and $R^n/S_R(B)$ are isomorphic.

Proof. By Theorem 1, there are invertible matrices P and Q such that B = PAQ. By Corollary 3, the modules $R^n/S_R(A)$ and $R^n/S_R(B)$ are isomorphic. □


There are certain n-column matrices D for which the structure of $R^n/S_R(D)$ is easy to describe. The next theorem generalizes some earlier remarks.


THEOREM 5. Let D be the m-by-n matrix

$$D = \begin{bmatrix} d_1 & & & \\ & \ddots & & \\ & & d_r & \\ & & & 0 \end{bmatrix},$$

in which all entries off the indicated diagonal are 0 and $d_1, \dots, d_r$ are elements of R. Then $R^n/S_R(D)$ is isomorphic to

$$M = (R/Rd_1) \oplus \dots \oplus (R/Rd_r) \oplus R^{n-r}.$$


Proof. If $1 \le i \le r$, then set $R_i = R/Rd_i$, and if r < n, then set $R_{r+1} = \dots = R_n = R$. Thus we can write $M = R_1 \oplus \dots \oplus R_n$. The map f taking $(a_1, \dots, a_n)$ in $R^n$ to

$$([a_1]_{d_1}, \dots, [a_r]_{d_r}, a_{r+1}, \dots, a_n)$$

is a homomorphism of $R^n$ onto M. (Here we are writing $[a_i]_{d_i}$ for $a_i + Rd_i$, $1 \le i \le r$.) The kernel K of f is the set of vectors $(a_1, \dots, a_r, 0, \dots, 0)$ such that $d_i$ divides $a_i$, $1 \le i \le r$. Thus $K = S_R(D)$ and $M \cong R^n/S_R(D)$. □


A matrix D with entries in R is said to be reduced if it has the form

$$D = \begin{bmatrix} d_1 & & & \\ & \ddots & & \\ & & d_r & \\ & & & 0 \end{bmatrix},$$

where the $d_i$ are nonzero elements of R and $d_i$ divides $d_{i+1}$, $1 \le i < r$. We will show that, whenever R is a Euclidean domain, any matrix over R is equivalent to a reduced matrix. Before proving this result, let us illustrate the basic idea with an example over the integers. Let

      ⎕←A←3 3⍴¯7 15 ¯7 4 8 ¯2 ¯5 11 ¯5
¯7 15 ¯7
 4  8 ¯2
¯5 11 ¯5

The following sequence of integer row and column operations converts A into a matrix that is reduced over Z.


      A[3;]←A[3;]-3×A[2;]
      A
 ¯7  15 ¯7
  4   8 ¯2
¯17 ¯13  1
      A[1 2;]←A[1 2;]+7 2∘.×A[3;]
      A
¯126 ¯76 0
 ¯30 ¯18 0
 ¯17 ¯13 1
      A[;1 2]←A[;1 2]+A[;3]∘.×17 13
      A
¯126 ¯76 0
 ¯30 ¯18 0
   0   0 1
      A[1 3;]←A[3 1;]
      A[;1 3]←A[;3 1]
      A
1   0    0
0 ¯18  ¯30
0 ¯76 ¯126
      A[3;]←A[3;]-4×A[2;]
      A
1   0   0
0 ¯18 ¯30
0  ¯4  ¯6
      A[2;]←A[2;]-4×A[3;]
      A
1  0  0
0 ¯2 ¯6
0 ¯4 ¯6
      A[3;]←A[3;]-2×A[2;]
      A
1  0  0
0 ¯2 ¯6
0  0  6
      A[;3]←A[;3]-3×A[;2]
      A
1  0 0
0 ¯2 0
0  0 6

THEOREM 6. Let R be a Euclidean domain. Every matrix with entries in R is equivalent over R to a reduced matrix.


Proof. Let N be the Euclidean norm on R. Thus N(a) is a nonnegative integer for all a ≠ 0 in R. Let A be an m-by-n matrix over R. We will describe a procedure for converting A into a reduced matrix using row and column operations. As with the row-reduction procedure of Theorem 1.8, this procedure is recursive, and the proof of its correctness proceeds by induction on m.

If A = 0, then A is reduced and we are done. Therefore we may assume A ≠ 0.


LEMMA 7. There is a matrix B equivalent to A over R such that B has the form

$$\begin{bmatrix} x & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & C & \\ 0 & & & \end{bmatrix},$$

where x ≠ 0 and x divides every entry in C.


Proof. Since A ≠ 0, there is a nonzero entry in A. Among all nonzero entries in A, let x be one such that N(x) is minimal. We will prove the lemma by induction on N(x). Let x = A[i;j].

CASE 1. Suppose x divides every entry in A. [Note that this case must happen if N(x) = N(1).] Interchanging A[1;] and A[i;] and then interchanging A[;1] and A[;j], we may assume i = j = 1. For i > 1 we can write A[i;1] as $e_ix$ for some $e_i$ in R. Subtracting $e_i$ times A[1;] from A[i;], we convert A into a matrix of the form

$$\begin{bmatrix} x & * & \cdots & * \\ 0 & & & \\ \vdots & & * & \\ 0 & & & \end{bmatrix}$$

and x still divides every entry in A. For j > 1 let $A[1;j] = f_jx$ with $f_j$ in R. Subtracting $f_j$ times A[;1] from A[;j], we now have A in the form

$$\begin{bmatrix} x & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & C & \\ 0 & & & \end{bmatrix}$$

and x divides every entry in C.

CASE 2. There is some entry $A[k;\ell]$ not divisible by x. Suppose $\ell = j$ so that x and $A[k;\ell]$ are in the same column. Write $A[k;\ell] = qx + r$, where r ≠ 0 and N(r) < N(x). Subtracting q times A[i;] from A[k;] makes $A[k;\ell]$ now equal to r. By induction on N(x), we can convert A to the required form. If i = k, we proceed in a similar manner. Thus we may assume that i ≠ k and $j \ne \ell$ and that x divides $A[i;\ell]$. Write $A[i;\ell]$ as ex, with e in R, and subtract e − 1 times A[;j] from $A[;\ell]$. Now $A[i;\ell]$ is ex − (e − 1)x = x, so we can replace j by $\ell$ and continue as in the $\ell = j$ case. □


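Having proved Lemma 7, we may assume A has the form

$$\begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & C & \\ 0 & & & \end{bmatrix},$$

where $d_1$ divides every entry in C. By induction on m, the matrix C is equivalent to a reduced matrix

$$C' = \begin{bmatrix} d_2 & & & \\ & \ddots & & \\ & & d_r & \\ & & & 0 \end{bmatrix}.$$

It follows (why?) that A is equivalent to

$$D = \begin{bmatrix} d_1 & & & & \\ & d_2 & & & \\ & & \ddots & & \\ & & & d_r & \\ & & & & 0 \end{bmatrix}.$$

This matrix D is reduced if $d_1$ divides $d_2$. The entries in C′ are R-linear combinations of the entries in C. Since $d_1$ divides all of the entries in C, it follows that $d_1$ divides $d_2$. Thus A is equivalent to a reduced matrix. □

Theorem 6 has some very important corollaries.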
Theorem 6 has some very important corollaries. 


COROLLARY 8. Let M be a finitely generated module over the Euclidean domain R. There exist nonnegative integers r and s and nonzero nonunits $d_1, \dots, d_r$ of R such that $d_i$ divides $d_{i+1}$, $1 \le i < r$, and

$$M \cong (R/Rd_1) \oplus \dots \oplus (R/Rd_r) \oplus R^s.$$


Proof. Let M be generated by n elements. Then $M \cong R^n/S_R(A)$, where A is some m-by-n matrix over R. By Theorem 6, there is a matrix D equivalent to A over R such that

$$D = \begin{bmatrix} d_1 & & & \\ & \ddots & & \\ & & d_r & \\ & & & 0 \end{bmatrix},$$

where the $d_i$ are nonzero and $d_i$ divides $d_{i+1}$, $1 \le i < r$. By Corollary 4 and Theorem 5,

$$M \cong R^n/S_R(A) \cong R^n/S_R(D) \cong (R/Rd_1) \oplus \dots \oplus (R/Rd_r) \oplus R^s,$$

where s = n − r. If some $d_i$ is a unit, then $Rd_i = R$ and $R/Rd_i = \{0\}$. In this case the summand may be omitted. □

The isomorphism $M \cong (R/Rd_1) \oplus \dots \oplus (R/Rd_r) \oplus R^s$ in Corollary 8 is called a cyclic decomposition of M. The quotient modules $R/Rd_i$ are said to be cyclic, since they are generated by one element.

COROLLARY 9. Let A be a finitely generated abelian group. Then there exist nonnegative integers r and s and integers $d_1, \dots, d_r$ greater than 1 such that

$$A \cong \mathbf{Z}_{d_1} \oplus \dots \oplus \mathbf{Z}_{d_r} \oplus \mathbf{Z}^s.$$


Proof. The ring Z is a Euclidean domain. Applying Corollary 8 to A, we see that there are integers r, s, $d_1, \dots, d_r$ such that $|d_i| \ge 2$ for all i, $d_i$ divides $d_{i+1}$, $1 \le i < r$, and

$$A \cong (\mathbf{Z}/\mathbf{Z}d_1) \oplus \dots \oplus (\mathbf{Z}/\mathbf{Z}d_r) \oplus \mathbf{Z}^s.$$

Since $\mathbf{Z}d = \mathbf{Z}(-d)$, we may assume the $d_i$ are positive. □


In Corollary 9 the integers r, s, $d_1, \dots, d_r$ are actually uniquely determined by A. We will prove this fact in the next section.


COROLLARY 10. If F is a field and M is a finitely generated module over R = F[X], then

$$M \cong (R/Rf_1) \oplus \dots \oplus (R/Rf_r) \oplus R^s,$$

where the $f_i$ are monic polynomials of positive degree and $f_i$ divides $f_{i+1}$, $1 \le i < r$.

Proof. This follows from Corollary 8. We may multiply the $f_i$ by any unit of R and thus we may assume each $f_i$ is monic. If some $f_i = 1$, then $R/Rf_i$ has order 1 and may be omitted. □


The workspace CLASSLIB contains procedures for reducing matrices over Z and $\mathbf{Z}_m[X]$. Here m must be a prime. As an example, we can use ZREDUCE on the preceding matrix A.

      A
¯7 15 ¯7
 4  8 ¯2
¯5 11 ¯5
      ZREDUCE A
1 0 0
0 2 0
0 0 6

By Theorem 5, $\mathbf{Z}^3/S_{\mathbf{Z}}(A)$ is isomorphic to $\mathbf{Z}_2 \oplus \mathbf{Z}_6$. The procedure ZREDUCE computes two invertible integer matrices R and S such that ZREDUCE A is R+.×A+.×S.


R R+.xAt+.x$ 
4 3 0 100 
m 1 6 02 0 
g 1 13 00 6 
2 
0 o 1 
0 1 3 
1 9 8 
The matrix

$$B = \begin{bmatrix} 1 - X & 2 \\ 1 & 1 - X \end{bmatrix}$$

over $R = \mathbf{Z}_7[X]$ can be reduced as follows.

      DAZV B←2 2 2⍴1 ¯1 2 0 1 0 1 ¯1
1 ¯1   2  0
1  0   1 ¯1
      N←7
      DAZV ZNXREDUCE B
1 0 0   0 0 0
0 0 0   6 5 1

Thus $R^2/S_R(B)$ is isomorphic to R/Rf, where $f = 6 + 5X + X^2$.

We close this section with another important corollary of Theorem 6.
We close this section with another important corollary of Theorem 6. 


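COROLLARY 11. Let A be an n-by-n matrix over Z. Then $S_{\mathbf{Z}}(A)$ has finite index in $\mathbf{Z}^n$ if and only if det A ≠ 0. If det A ≠ 0, then

$$|\mathbf{Z}^n : S_{\mathbf{Z}}(A)| = |\det A|.$$

Proof. We can find a reduced matrix

$$D = \begin{bmatrix} d_1 & & & \\ & \ddots & & \\ & & d_r & \\ & & & 0 \end{bmatrix}$$

equivalent to A. The $d_i$ may be assumed positive. There exist units P and Q in $M_n(\mathbf{Z})$ such that D = PAQ. Since |det P| = |det Q| = 1, we have

$$|\det A| = \det D = \begin{cases} 0, & r < n, \\ d_1 \cdots d_n, & r = n. \end{cases}$$

Now $\mathbf{Z}^n/S_{\mathbf{Z}}(A) \cong \mathbf{Z}^n/S_{\mathbf{Z}}(D)$. If r < n, then $\mathbf{Z}^n/S_{\mathbf{Z}}(D)$ is infinite while, if r = n, then $\mathbf{Z}^n/S_{\mathbf{Z}}(D) \cong \mathbf{Z}_{d_1} \oplus \dots \oplus \mathbf{Z}_{d_n}$ has order $d_1 \cdots d_n$. □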


EXERCISES 


1 Find all isomorphisms derivable from Corollary 3.6.2 among the 
following groups. 


Zx6 Z,®Z, ®@Zz 
Z, ® Lis Z, ®Z; OZ, 
Z; ®Zy Z,®Z,®Z, 
Z, ® Zo Z, ®Z, @Z; OZ; 




By considering elements of largest order, show that Z,,, Z, ® Z15, 
and Z, ® Z, ® Z, are nonisomorphic. 


Reduce by hand each of the following matrices over the integers 
to a reduced matrix. 


(a) |10 6 (b) [2 -1 3 
8 4 4 3 5 
(c) 3 2 4 


O -l J 
2 3 2 


For each matrix A in Exercise 3 find invertible integer matrices 
P and Q such that PAQ is reduced. 


5 Execute ZREDUCE ?4 4⍴10 ten times. What do the results indicate about the probability that $\mathbf{Z}^n/S_{\mathbf{Z}}(A)$ is cyclic, where A is a randomly chosen square integer matrix?


Reduce the matrix 


1+ X 3 2+X 
B= xX 44+ 2X J 
4 1+ 3X 2X 


over R = Z;[X] to a reduced matrix using ZVXREDUCE. What is 
the order of R?/Sp (B)? 


Reduce the matrix 
3 +i 2+ 4i 
2i —l] + 3i 


to a reduced matrix over Z[i] . 
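8 Let m and n be integers and set d = gcd(m, n) and $\ell$ = lcm(m, n). Show that

$$\begin{bmatrix} m & 0 \\ 0 & n \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} d & 0 \\ 0 & \ell \end{bmatrix}$$

are equivalent over Z.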
Let R = Z[i] and z = 2 + 3i. Find a Z-basis for Rz. Show that 
|R/Rz| is finite and determine its order. 


Generalize Exercise 9 to the case in which z =a + bi is any nonzero 
element of Z[i]. 


11 Suppose G is an abelian group generated by elements a, b, c such that

2a − 3b + 4c = 0,
4a + 5b − 6c = 0,
−3a + 2b + 3c = 0.

Show that G is finite with order dividing 128.


Show that every square matrix over a Euclidean domain is equiv- 
alent to its transpose. 


Let 
3 1 —-l 0 
A= | -2 0 | | 
0 3 2 ] 

and 


B= ] 4 2 2 
—7 2 5 3 
Set U= S7(A) and V = Sz(B). 


(a) Show that the rows of A are a basis for U. 
(b) Show that V C U and find an integer matrix C such that 


B=C+.*xA. 
(c) Show that U/V is isomorphic to Z>/Sz(C) and determine 
|U/V|. 


14 Suppose a sequence of row and column operations reduces a square matrix A to the identity matrix I. Is it true that the same sequence of row and column operations will transform I into $A^{-1}$?


15 Suppose the integer matrix A is equivalent to the reduced matrix

$$D = \begin{bmatrix} d_1 & & & \\ & \ddots & & \\ & & d_r & \\ & & & 0 \end{bmatrix}.$$

Prove that $|d_1|$ is the gcd of the entries of A.


16 Show that a 2-by-2 integer matrix A is determined up to equiv- 
alence by | det A| and the gcd of the entries of A. 


6. UNIQUENESS OF CYCLIC DECOMPOSITIONS 


In this section we will complete our discussion of finitely generated mod- 
ules over a Euclidean domain by sketching a proof of the following the- 
orem, which shows that the cyclic decompositions given by Corollary 
5.8 are essentially unique. 


THEOREM 1. Let R be a Euclidean domain, let r, s, t, u be nonnegative
integers and let $d_1, \ldots, d_r$ and $e_1, \ldots, e_t$ be nonzero nonunits of
R such that $d_i$ divides $d_{i+1}$, $1 \le i < r$, and $e_i$ divides $e_{i+1}$, $1 \le i < t$. If

$$(R/Rd_1) \oplus \cdots \oplus (R/Rd_r) \oplus R^s$$

is isomorphic as an R-module to

$$(R/Re_1) \oplus \cdots \oplus (R/Re_t) \oplus R^u,$$

then r = t, s = u and $d_i$ is an associate of $e_i$ for $1 \le i \le r$.

In order to prove Theorem 1 we must consider various submodules of 
a given R-module M. We will define these submodules and state some prop- 
erties of them. The details will be left as exercises. 

An element x of M is called a torsion element if $ax = 0$ for some nonzero
a of R. The set of all torsion elements of M will be denoted $T(M)$.

LEMMA 2. The set $T(M)$ is a submodule of M. If M and N are isomorphic
modules, then $T(M) \cong T(N)$ and $M/T(M) \cong N/T(N)$. For any
R-modules U and V we have $T(U \oplus V) = T(U) \oplus T(V)$ and

$$(U \oplus V)/T(U \oplus V) \cong [U/T(U)] \oplus [V/T(V)].$$

Moreover, $T(R) = \{0\}$ and, if $d \ne 0$, then $T(R/Rd) = R/Rd$. □

For any a in R let $aM = \{ax \mid x \in M\}$.


LEMMA 3. The set aM is a submodule of M. If b is in R, then $a(bM) = (ab)M$.
If M and N are isomorphic modules, then $aM \cong aN$ and $M/aM \cong N/aN$.
For any R-modules U and V we have $a(U \oplus V) = (aU) \oplus (aV)$ and

$$(U \oplus V)/a(U \oplus V) \cong (U/aU) \oplus (V/aV).$$

Moreover, $a(R/Rd) = Rf/Rd$, where $f = \gcd(a, d)$, and

$$(R/Rd)/a(R/Rd) \cong R/Rf. \qquad \square$$
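
The last statement of Lemma 3 can be checked numerically in the special case R = Z. The following Python sketch is only an illustration and is not part of CLASSLIB; the function names are hypothetical. It lists the submodule a(Z/dZ) by brute force and compares it with the submodule generated by f = gcd(a, d).

```python
from math import gcd

def multiples_in_Zd(a, d):
    """Return the set a*(Z/dZ) = {a*x mod d : x in Z/dZ}."""
    return {(a * x) % d for x in range(d)}

def check_lemma3(a, d):
    """Check both claims of Lemma 3 for R = Z and M = Z/dZ.

    Returns a pair of booleans:
      (a*M equals the submodule generated by f,  |M/(a*M)| equals f)
    where f = gcd(a, d).
    """
    f = gcd(a, d)
    image = multiples_in_Zd(a, d)
    generated_by_f = {(f * x) % d for x in range(d)}  # plays the role of Rf/Rd
    quotient_order = d // len(image)                  # order of (R/Rd)/a(R/Rd)
    return image == generated_by_f, quotient_order == f

for a, d in [(6, 8), (4, 12), (5, 9), (0, 7)]:
    assert check_lemma3(a, d) == (True, True)
```

For example, with a = 6 and d = 8 we have f = 2, the set 6(Z/8Z) consists of the classes of 0, 2, 4, 6, and the quotient has order 2.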


We have already seen a special case of the following lemma in Lemma 
5.4.4. 




LEMMA 4. Let p be a prime in R. Then R/Rp is a field. □

The next lemma is a special case of a more general result. (See Exercise 2.)

LEMMA 5. For all a in R we may give M/aM the structure of a module
over R/Ra by defining

$$(b + Ra)(x + aM) = bx + aM. \qquad \square$$


LEMMA 6. Suppose p and d are in R with p prime and d nonzero.
Set $M = R/Rd$. If j is a positive integer, then the dimension of $(p^{j-1}M)/(p^jM)$
over the field $F = R/Rp$ is 1 or 0 according to whether $p^j$ does or does not
divide d.

Proof. Since $p^jM = p(p^{j-1}M)$, the quotient $V = (p^{j-1}M)/(p^jM)$ is a
vector space over F by Lemmas 4 and 5. By Lemma 3,

$$p^{j-1}M = (Re)/Rd, \qquad p^jM = (Rf)/Rd,$$

where $e = \gcd(p^{j-1}, d)$ and $f = \gcd(p^j, d)$. If $p^j$ does not divide d, then
e and f are associates and $Re = Rf$. Thus $p^{j-1}M = p^jM$ and V is trivial. If
$p^j$ does divide d, we may take $e = p^{j-1}$ and $f = p^j$. In this case V is
isomorphic by the Third Isomorphism Theorem to $(Re)/(Rf)$, which is
nontrivial and generated by $e + Rf$. Thus $\dim_F V = 1$. □


Now we can prove Theorem 1. Suppose

$$M = (R/Rd_1) \oplus \cdots \oplus (R/Rd_r) \oplus R^s$$

is isomorphic to

$$N = (R/Re_1) \oplus \cdots \oplus (R/Re_t) \oplus R^u.$$

Then, by Lemma 2,

$$T(M) = (R/Rd_1) \oplus \cdots \oplus (R/Rd_r)$$

is isomorphic to

$$T(N) = (R/Re_1) \oplus \cdots \oplus (R/Re_t).$$

Also

$$R^s \cong M/T(M) \cong N/T(N) \cong R^u.$$

Thus s = u by Corollary 5.2.9. Now let p be a prime and set F = R/Rp. By
Lemmas 3 and 6, the dimension over F of T(M)/pT(M) is at most r. If we
choose p dividing $d_1$, which we can do because $d_1$ is a nonzero nonunit,
then p divides every $d_i$ and the dimension of T(M)/pT(M) is r. Similarly, t is
the largest dimension of T(N)/pT(N) for any prime p of R. But T(M)/pT(M) and
T(N)/pT(N) are isomorphic, and so r = t. Now fix a prime p of R. For any
$j \ge 1$ the dimension over F of

$$p^{j-1}T(M)/p^jT(M) \cong p^{j-1}T(N)/p^jT(N)$$

is the number of $d_i$ that are divisible by $p^j$ and also the number of $e_i$ that
are divisible by $p^j$. Since $d_i$ divides $d_{i+1}$ and $e_i$ divides $e_{i+1}$, it follows that
for each i the elements $d_i$ and $e_i$ are divisible by the same prime powers.
Thus $d_i$ and $e_i$ are associates. □
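
The counting argument in this proof is easy to try out for R = Z. The Python sketch below is an illustration under that assumption, with hypothetical function names. It uses Lemma 6 to compute the dimension of $p^{j-1}T/p^jT$ for $T = Z_{d_1} \oplus \cdots \oplus Z_{d_r}$ as the number of $d_i$ divisible by $p^j$, and then compares two groups of the same order.

```python
def dimension(ds, p, j):
    """Dimension over Z_p of p^(j-1)T / p^j T, where T is the direct sum
    of the cyclic groups Z_d for d in ds.  By Lemma 6 each summand
    contributes 1 exactly when p**j divides d."""
    return sum(1 for d in ds if d % p**j == 0)

def profile(ds, primes, max_j):
    """Table of all the dimensions used in the proof of Theorem 1."""
    return {(p, j): dimension(ds, p, j)
            for p in primes for j in range(1, max_j + 1)}

# Two chains d_1 | d_2 | d_3 giving groups of the same order 96.
ds = [2, 4, 12]
es = [2, 2, 24]

# The profiles differ at (p, j) = (2, 2), so the groups are not isomorphic.
print(profile(ds, [2, 3], 3))
print(profile(es, [2, 3], 3))
```

Isomorphic groups must produce identical tables, which is the invariant the proof extracts from the module structure alone.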


Theorem 1 has many important consequences.


THEOREM 7 (Fundamental Theorem of Finitely Generated Abelian
Groups). Let G be a finitely generated abelian group. There exist unique
nonnegative integers r and s and unique integers $d_1, \ldots, d_r$ greater than
1 with $d_i$ dividing $d_{i+1}$ such that G is isomorphic to

$$Z_{d_1} \oplus \cdots \oplus Z_{d_r} \oplus Z^s.$$

Proof. Existence of r, s and $d_1, \ldots, d_r$ is given by Corollary 5.9 and
uniqueness by Theorem 1. □


THEOREM 8. Suppose the m-by-n matrix A over R is equivalent to the
reduced matrix

$$D = \begin{pmatrix} \operatorname{diag}(d_1, \ldots, d_r) & 0 \\ 0 & 0 \end{pmatrix}.$$

Then r is unique and $d_1, \ldots, d_r$ are determined up to associates.

Proof. We have

$$R^n/S_R(A) \cong R^n/S_R(D) \cong (R/Rd_1) \oplus \cdots \oplus (R/Rd_r) \oplus R^{n-r}.$$

Suppose $d_1, \ldots, d_t$ are units and $d_{t+1}, \ldots, d_r$ are nonzero nonunits. Then,
by Theorem 1, $n - r$ and $r - t$ are determined and $d_{t+1}, \ldots, d_r$ are
determined up to associates. Therefore $t = n - (n - r) - (r - t)$ is determined
and, since all units are associates of 1, we see that $d_1, \ldots, d_t$ are determined
up to associates. □


The elements $d_1, \ldots, d_r$ of Theorem 8 are called the elementary
divisors of A and r is called the rank of A. The rank of the free module
$S_R(A)$ is called the row rank of A and the rank of the module $S_R(A^t)$, the
module spanned by the columns of A, is called the column rank of A.
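
For R = Z the elementary divisors can be computed by the row and column reduction of Section 5. The Python sketch below is a plain illustration of that idea and makes no claim to reproduce the CLASSLIB procedure ZREDUCE; the name `reduce_matrix` and its helpers are hypothetical. It returns matrices P, D, Q with D = PAQ, where D is diagonal with nonnegative entries, each dividing the next.

```python
def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def reduce_matrix(A):
    """Return (P, D, Q) with D = P*A*Q reduced over Z.
    P and Q are built up from elementary row and column operations."""
    m, n = len(A), len(A[0])
    D = [row[:] for row in A]
    P, Q = identity(m), identity(n)

    def swap_rows(i, k):
        D[i], D[k] = D[k], D[i]
        P[i], P[k] = P[k], P[i]

    def swap_cols(j, k):
        for M in (D, Q):
            for row in M:
                row[j], row[k] = row[k], row[j]

    def add_row(src, dst, c):          # row dst += c * row src
        for M in (D, P):
            M[dst] = [y + c * x for x, y in zip(M[src], M[dst])]

    def add_col(src, dst, c):          # column dst += c * column src
        for M in (D, Q):
            for row in M:
                row[dst] += c * row[src]

    def negate_row(i):
        for M in (D, P):
            M[i] = [-x for x in M[i]]

    for t in range(min(m, n)):
        while True:
            nonzero = [(abs(D[i][j]), i, j) for i in range(t, m)
                       for j in range(t, n) if D[i][j] != 0]
            if not nonzero:            # the remaining block is zero
                return P, D, Q
            _, pi, pj = min(nonzero)   # entry of least absolute value
            swap_rows(t, pi)
            swap_cols(t, pj)
            p = D[t][t]
            for i in range(t + 1, m):  # reduce column t modulo the pivot
                add_row(t, i, -(D[i][t] // p))
            for j in range(t + 1, n):  # reduce row t modulo the pivot
                add_col(t, j, -(D[t][j] // p))
            if any(D[i][t] for i in range(t + 1, m)) or \
               any(D[t][j] for j in range(t + 1, n)):
                continue               # a smaller remainder appeared; repeat
            bad = [i for i in range(t + 1, m)
                   if any(D[i][j] % p for j in range(t + 1, n))]
            if not bad:
                break                  # pivot divides every remaining entry
            add_row(bad[0], t, 1)      # bring a non-multiple into row t
        if D[t][t] < 0:
            negate_row(t)
    return P, D, Q

A = [[7, 5, 1], [1, 3, -1], [3, -7, 9]]
P, D, Q = reduce_matrix(A)
print(D)   # diagonal entries are the elementary divisors of A
```

The matrix A used here is the 3-by-3 integer matrix of the example at the end of this section; the matrices P and Q found this way need not agree with the ones ZREDUCE returns, since only D is determined by A.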


THEOREM 9. For any matrix A over R the ranks of A and of $A^t$ are
the same and up to associates A and $A^t$ have the same elementary divisors.
The rank, the row rank, and the column rank of A are all equal.

Proof. Assume that A is an m-by-n matrix and let A be equivalent to
the reduced matrix

$$D = \begin{pmatrix} \operatorname{diag}(d_1, \ldots, d_r) & 0 \\ 0 & 0 \end{pmatrix}.$$

Thus $D = P_1 \cdots P_s A Q_1 \cdots Q_t$, where the $P_i$ and $Q_i$ are elementary
matrices. Then $D^t = Q_t^t \cdots Q_1^t A^t P_s^t \cdots P_1^t$. Since the transpose of an
elementary matrix is elementary, $A^t$ is equivalent to $D^t$, which is reduced.
Thus A and $A^t$ have the same rank and up to associates the same elementary
divisors. By the proof of Corollary 5.3, there is an automorphism of $R^n$
taking $S_R(A)$ onto $S_R(D)$. Thus $S_R(A)$ and $S_R(D)$ have the same rank.
Therefore the row rank of A is r. Similarly, the column rank of A, which
is the row rank of $A^t$, is equal to the rank of $D^t$, which is r. □


There is one more question related to finitely generated modules over
Euclidean domains that we should answer. Let M be a module generated by
$x_1, \ldots, x_n$ and let $\tau : R^n \to M$ be the map taking $(a_1, \ldots, a_n)$ to
$a_1x_1 + \cdots + a_nx_n$. If we know a matrix A such that $S_R(A)$ is the kernel
of $\tau$, we can determine the structure of M by finding the reduced matrix

$$D = \begin{pmatrix} \operatorname{diag}(d_1, \ldots, d_r) & 0 \\ 0 & 0 \end{pmatrix}$$

equivalent to A. We know that M is isomorphic to

$$N = (R/Rd_1) \oplus \cdots \oplus (R/Rd_r) \oplus R^{n-r}.$$

But how can we establish an explicit isomorphism between M and N?
The matrix D has the form PAQ, where P and Q are invertible matrices
over R. Let $U = Q^{-1}$. For $1 \le i \le n$ define

$$y_i = \sum_{j=1}^{n} U_{ij}x_j.$$
THEOREM 10. Under the assumptions just described, M is the internal
direct sum

$$(Ry_1) \oplus \cdots \oplus (Ry_n).$$

If $1 \le i \le r$, then $Ry_i$ is isomorphic to $R/Rd_i$ and, if $r < i \le n$, then $Ry_i$ is
isomorphic to R.

Proof. To simplify the notation in the arguments that follow, let us
extend the definition of matrix multiplication to cover the product of an
array over R and an array with entries in M. The result will be an array with
entries in M. Suppose $C = (c_1, \ldots, c_n)$ is a vector in $R^n$ and E is an m-by-n
matrix over R. If $V = (v_1, \ldots, v_n)$ is in $M^n$, then CV will denote

$$\sum_{j=1}^{n} c_jv_j,$$

and EV will be the element of $M^m$ whose ith component is

$$\sum_{j=1}^{n} E_{ij}v_j.$$

The usual properties of matrix multiplication are satisfied. (See Exercise 4.)
The vectors $X = (x_1, \ldots, x_n)$ and $Y = (y_1, \ldots, y_n)$ are in $M^n$ and
Y = UX. Multiplying on the left by $Q = U^{-1}$, we obtain

$$QY = Q(UX) = (QU)X = X.$$

Thus the $x_i$ can be expressed as linear combinations of the $y_j$, so $y_1, \ldots, y_n$
generate M. Therefore

$$M = (Ry_1) + \cdots + (Ry_n).$$

To show that M is the direct sum of the submodules $Ry_i$, we must show that
whenever we have elements $z_i$ in $Ry_i$, $1 \le i \le n$, such that $z_1 + \cdots + z_n = 0$,
then $z_i = 0$ for each i.

If $C = (c_1, \ldots, c_n)$ is in $R^n$, then the image of C under $\tau$ is CX. If
C is in the kernel of $\tau$, then C is a linear combination of the rows of A and
there is a vector B over R such that C = BA. Since the rows of A are in the
kernel of $\tau$, we have AX = (0, ..., 0), the zero element of $M^m$. Now

$$DY = (PAQ)(UX) = (PAQU)X = PAX = 0.$$

If $1 \le i \le r$, then the ith component of DY is $d_iy_i$, so $d_iy_i = 0$, $1 \le i \le r$.
Suppose $z_i$ is in $Ry_i$ for $1 \le i \le n$ and $z_1 + \cdots + z_n = 0$. Then $z_i = c_iy_i$ for
some $c_i$ in R. Let $C = (c_1, \ldots, c_n)$. We have

$$CY = c_1y_1 + \cdots + c_ny_n = z_1 + \cdots + z_n = 0.$$

But CY = C(UX) = (CU)X, so CU is in the kernel of $\tau$ and there is a vector B
over R such that CU = BA. Therefore

$$C = CUQ = BAQ = (BP^{-1})(PAQ) = (BP^{-1})D$$

and C is a linear combination of the rows of D. Hence $c_i = 0$ for $r < i \le n$
and $d_i$ divides $c_i$ for $1 \le i \le r$. Therefore $z_i = c_iy_i = 0$ for all i and

$$M = (Ry_1) \oplus \cdots \oplus (Ry_n).$$

Suppose c is in R. If $1 \le i \le r$, then $cy_i = 0$ if and only if $d_i$ divides c, so
$Ry_i$ is isomorphic to $R/Rd_i$. If $r < i \le n$, then $cy_i = 0$ only if c = 0, so $Ry_i$ is
isomorphic to R. □


Let us illustrate Theorem 10 with an example in which R = Z. Suppose
G is an abelian group of order 64 that we know is generated by elements
a, b, and c. Assume also that the following equations hold in G.

$$\begin{aligned} 7a + 5b + c &= 0, \\ a + 3b - c &= 0, \\ 3a - 7b + 9c &= 0. \end{aligned}$$

Let $\tau : Z^3 \to G$ map (i, j, k) to ia + jb + kc. Then the rows of

      ⎕←A←3 3⍴7 5 1 1 3 ¯1 3 ¯7 9
 7  5  1
 1  3 ¯1
 3 ¯7  9

are in the kernel of $\tau$. Since A has determinant 64 = |G|,

      ZDET A
64

$S_Z(A)$ has index 64 in $Z^3$, so the kernel of $\tau$ is $S_Z(A)$. From

      ⎕←D←ZREDUCE A
 1  0  0
 0  4  0
 0  0 16

we see that $G \cong Z_4 \oplus Z_{16}$. Now ZREDUCE computed two matrices R and
S in $GL_3(Z)$ such that D is R+.×A+.×S.


      R                  S
  1   0   0         0   0   1
 ¯2   7   1         0   1   1
  5 ¯13  ¯2         1  ¯5 ¯12

Let

      ⎕←U←ZMATINV S
 7  5  1
¯1  1  0
 1  0  0

According to Theorem 10, if we set

$$x = 7a + 5b + c, \qquad y = -a + b, \qquad z = a,$$

then G is the internal direct sum ⟨x⟩ ⊕ ⟨y⟩ ⊕ ⟨z⟩ and x, y, and
z have orders 1, 4, and 16, respectively. This means x = 0, and so G =
⟨y⟩ ⊕ ⟨z⟩.


EXERCISES 


1 Prove Lemmas 2, 3, and 4. 


2 Suppose M is an R-module and I is an ideal of R such that au = 0
for all a in I and all u in M. Prove that defining (b + I)u = bu makes
M into a module over S = R/I. Derive Lemma 5 as a corollary.


3 For any matrix A over R and any positive integer k let $I_k(A)$ be
the ideal of R generated by the determinants of the k-by-k submatrices
of A. Show that if A and B are equivalent, then $I_k(A) = I_k(B)$.
Show also that if D is a reduced matrix with nonzero diagonal
entries $d_1, \ldots, d_r$, then $I_k(D)$ is the ideal generated by
$d_1 \cdots d_k$ if $k \le r$ and $I_k(D) = \{0\}$ if $k > r$. Give an alternate proof
of Theorem 8.




4 Let M be an R-module. If $C = (c_1, \ldots, c_n)$ is in $R^n$ and
$V = (v_1, \ldots, v_n)$ is in $M^n$, define CV to be the element

$$\sum_{i=1}^{n} c_iv_i.$$

Extend this definition to give the product AW of an array A over
R and an array W with entries in M, provided the dimension of A
along the last axis is the same as the dimension of W along the first
axis. Let A and B be arrays over R and let U and W be arrays over
M. Prove the following identities, assuming the indicated products
are defined.

$$(A + B)W = AW + BW, \qquad A(U + W) = AU + AW.$$

Show that A(BW) = (AB)W provided B has rank at least 2.


5 Let G be an abelian group generated by elements a, b, c, d and
suppose the kernel of the map $(i, j, k, \ell) \mapsto ia + jb + kc + \ell d$ of
$Z^4$ onto G is $S_Z(A)$, where


Show that $G \cong Z_3 \oplus Z_{12} \oplus Z$. Find elements x, y, and z of G such
that x has order 3, y has order 12, and G is the internal direct sum
⟨x⟩ ⊕ ⟨y⟩ ⊕ ⟨z⟩.


7. SOLVING LINEAR SYSTEMS USING 
ROW AND COLUMN OPERATIONS 


The method of solving linear systems given in Section 4 is adequate for sys- 
tems over fields but not for systems over other rings such as Z. In this sec- 
tion we will describe additional techniques that are sufficient for solving 
systems over any Euclidean domain. These techniques involve column 
operations as well as row operations. In addition to Theorem 4.4, we will 
need the following result. 


THEOREM 1. Let A be an m-by-n matrix over a commutative ring
R, let B be in $R^m$, and let Q be in $GL_n(R)$. Set $A' = AQ$. There is a 1-1
correspondence between the solutions of AX = B and the solutions of
A'Y = B. This correspondence is given by X = QY and $Y = Q^{-1}X$.

Proof. Suppose AX = B. Then $A'(Q^{-1}X) = (AQ)(Q^{-1}X) = AX = B$.
Thus $Q^{-1}X$ is a solution of A'Y = B. Conversely, if A'Y = B, then A(QY) =
(AQ)Y = A'Y = B, and so QY is a solution of AX = B. □


The next theorem is the main result of this section. 


THEOREM 2. Let AX = B be a linear system with m equations in n
variables over a Euclidean domain R. Suppose A is equivalent over R to the
reduced matrix D with elementary divisors $d_1, \ldots, d_r$. Let D = PAQ, where
P is in $GL_m(R)$ and Q is in $GL_n(R)$. Set $B' = PB = (b'_1, \ldots, b'_m)$. The
system AX = B is consistent if and only if $b'_i = 0$, $r < i \le m$, and $d_i$ divides
$b'_i$, $1 \le i \le r$. If this is the case, then a solution is given by C = QE, where
$E = (b'_1/d_1, \ldots, b'_r/d_r, 0, \ldots, 0)$. The last $n - r$ columns of Q are a basis
for the module of solutions of AX = 0.

Proof. The system DY = B' consists of the equations $d_iy_i = b'_i$, $1 \le i \le r$,
and $0 = b'_i$, $r < i \le m$. Clearly, this system is consistent if and only if
$d_i$ divides $b'_i$, $1 \le i \le r$, and $b'_i = 0$, $r < i \le m$. If this condition is satisfied,
then the vector $E = (b'_1/d_1, \ldots, b'_r/d_r, 0, \ldots, 0)$ in $R^n$ satisfies DE = B'.
Now D = PAQ and B' = PB, so Theorem 4.4 tells us that DY = B' and
(AQ)Y = B have the same solutions. By Theorem 1, there is a 1-1
correspondence between the solutions of (AQ)Y = B and those of AX = B.
Therefore AX = B is consistent if and only if $d_i$ divides $b'_i$, $1 \le i \le r$, and
$b'_i = 0$, $r < i \le m$. In this case QE is a solution of AX = B.

The solutions of DY = 0 are the vectors $Y = (y_1, \ldots, y_n)$ such that
$y_1 = \cdots = y_r = 0$. Thus the last $n - r$ standard basis vectors for $R^n$ span
the set of solutions of DY = 0. The product of Q and the ith standard basis
vector of $R^n$ is Q[;i], so the last $n - r$ columns of Q are solutions of AX = 0.
If X in $R^n$ satisfies AX = 0, then $Y = Q^{-1}X$ satisfies DY = 0. Thus the
first r components of Y are 0, and X = QY is a linear combination of the
last $n - r$ columns of Q. Since Q is invertible, the columns of Q are linearly
independent, and the last $n - r$ columns of Q are a basis for the set of
solutions of AX = 0. □


Let us use Theorem 2 to solve the following system over Z.

$$\begin{aligned} -5x + 2y + 4z + w &= 8, \\ 27x + 10y + 2z + 7w &= 6, \\ -20x - 6y - 4w &= -10. \end{aligned}$$

Entering the matrix of coefficients and the vector of constant terms

      ⎕←A←3 4⍴¯5 2 4 1 27 10 2 7 ¯20 ¯6 0 ¯4
 ¯5   2   4   1
 27  10   2   7
¯20  ¯6   0  ¯4
      B←8 6 ¯10

and reducing the coefficient matrix,

      ⎕←D←ZREDUCE A
1 0 0 0
0 2 0 0
0 0 6 0
      R
1 0 0
4 0 1
1 1 2
      ⎕←B1←R+.×B
8 22 ¯6
      S
0  0  0  1
0  1 ¯8 ¯4
0  0  1  3
1 ¯2 12  1

we find that, in the notation of Theorem 2, $d_1 = 1$, $d_2 = 2$, $d_3 = 6$, and
B' is (8, 22, -6). Since $d_i$ divides $b'_i$, $1 \le i \le 3$, we get the solution

      ⎕←C←S+.×8 11 ¯1 0
0 19 ¯1 ¯26
      A+.×C
8 6 ¯10

Any solution of the corresponding homogeneous system is an integer
multiple of the last column of S.
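
The steps of Theorem 2 can also be traced outside APL. The Python sketch below is only an illustration. It takes the matrices of the example as given, with P standing for the matrix called R in the session and Q for the matrix called S, and the helper names `matvec` and `solve` are hypothetical.

```python
def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

A = [[-5, 2, 4, 1], [27, 10, 2, 7], [-20, -6, 0, -4]]
B = [8, 6, -10]
P = [[1, 0, 0], [4, 0, 1], [1, 1, 2]]                 # R in the session
Q = [[0, 0, 0, 1], [0, 1, -8, -4], [0, 0, 1, 3], [1, -2, 12, 1]]  # S in the session
d = [1, 2, 6]                                         # elementary divisors of A

def solve(B, P, Q, d):
    """Apply Theorem 2: return (C, kernel_basis), or None if AX = B
    has no solution over Z."""
    n, r = len(Q), len(d)
    b1 = matvec(P, B)                                 # B' = PB
    # Consistency: d_i divides b'_i for i <= r and b'_i = 0 for i > r.
    if any(b1[i] % d[i] for i in range(r)) or any(b1[r:]):
        return None
    E = [b1[i] // d[i] for i in range(r)] + [0] * (n - r)
    C = matvec(Q, E)                                  # particular solution QE
    kernel = [[row[j] for row in Q] for j in range(r, n)]  # last n-r columns of Q
    return C, kernel

C, kernel = solve(B, P, Q, d)
print(C)        # a solution of AX = B
print(kernel)   # basis of the solutions of AX = 0
```

Changing the last entry of B from -10 to -9 gives B' = (8, 23, -4); since 2 does not divide 23, `solve` then returns None, which reflects the consistency test of Theorem 2.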
Theorem 2 has an important corollary. 


COROLLARY 3. Let A be an m-by-n matrix over a Euclidean domain
R. If m < n, then the system AX = 0 has a nontrivial solution.

Proof. Let A be equivalent over R to the reduced matrix D with
elementary divisors $d_1, \ldots, d_r$. By Theorem 2, the solutions of the system
AX = 0 form a submodule of $R^n$ of rank $n - r$. Since $r \le m < n$, it follows
that $n - r$ is positive, so nontrivial solutions exist. □


There are procedures in CLASSLIB for solving linear systems over
the rings R, Z, and $Z_n$. For example, if A is an integer matrix and B is an
integer vector of length 1↑⍴A, then C←A ZLSYS B is a solution to the
system ∧/B=A+.×X over Z, provided this system is consistent. In addition,
ZLSYS constructs a global variable W, which is an integer matrix whose
rows are a basis for the solutions of the homogeneous system ∧/0=A+.×X.
Using ZLSYS to solve the linear system over Z discussed previously,

      A
 ¯5   2   4   1
 27  10   2   7
¯20  ¯6   0  ¯4
      B
8 6 ¯10
      A ZLSYS B
0 19 ¯1 ¯26
      W
1 ¯4 3 1

we get the solution obtained earlier.

The procedure RLSYS solves linear systems over R using the methods
of Section 4. It is used the same way as ZLSYS.

      A RLSYS B
26 ¯85 77 0
      W
1 ¯4 3 1
      A+.×26 ¯85 77 0
8 6 ¯10

To solve linear systems over $Z_n$, n a prime, the procedure ZVNLSYS may be
used.


EXERCISES 


1 Let R be an integral domain and let M be the submodule of $R^n$
consisting of all solutions of some homogeneous linear system over
R. Show that if u is in $R^n$ and for some nonzero a in R the vector
au is in M, then u is in M. Exhibit a submodule of $Z^2$ that is not the
set of solutions of any homogeneous system in two variables over Z.
2 Let V be a subspace of $Q^n$ of dimension m. Show that $Z^n \cap V$ is a
Z-module of rank m.
3 Solve each of the following linear systems over Z.
(a) $2x_1 - x_2 = 4$
    $5x_1 - x_2 = 9$
(b) $-x_1 + 4x_2 = 7$
    $3x_1 - 13x_2 = -23$

(c) $5x_1 - 2x_2 - 29x_3 = 43$
    $4x_1 - x_2 - 22x_3 = 34$
    $2x_1 + x_2 - 8x_3 = 18$

(d) 4x, + X3— 2X4g= 9 
—2x, — 8x3 +1lx, =—-—16 
XxX, t X2 + 5x3 - 7x, = 12 


4 LetR=Z,;[X]. Solve the system A Y = 0 over R, where 
1+X 4+X? 4+2Xx? 
A= X 34+ X+4+X* 14+2X 
X* 1+X+2X? 24+X+X3 
5 Let $V = S_Q(A)$, where

$$A = \begin{pmatrix} 3/2 & 1/3 & 2 & 7/8 \\ 1 & 1/5 & 1/2 & 2 \\ 4/3 & 9/2 & -1 & -2/5 \end{pmatrix}.$$

Determine a Z-basis for $V \cap Z^4$. (Hint. Find an integer matrix
C such that V is the set of solutions of the linear system CX = 0
over Q.)

*6 Show that Corollary 3 is true when R is any commutative ring. 


7 Let f be the element of $\operatorname{End}_Z(Z^4)$ whose matrix with respect to the
standard basis is

$$\begin{pmatrix} 2 & -4 & 9 & 1 \\ -5 & -3 & -16 & -9 \\ 3 & -1 & 11 & 4 \\ 7 & -1 & 25 & 10 \end{pmatrix}.$$

Find a basis for the kernel of f and decide whether (8, 0, 22, 12)
is in the image of f.


8 Find a Z-basis for $S_Z(A) \cap S_Z(B)$, where


3 -l 2 4 2 5 -3 8 
A=| 5 2 —7 1], B=] 7 | —6 
—2 4 -l 3 3 —4 5 3 


(Hint. See Exercise 4.9.) 


FIELDS 


So far in this book we have encountered relatively few fields. We have 
worked extensively with Q, R, C, and $Z_p$, p a prime. In Section 4.8 we
defined fields of rational functions, and in Section 5.4 we constructed a field
with nine elements. However, there are many other fields, and it is important 
to devote some time to studying the structure of fields. One valuable result 
of this study will be a complete description of all finite fields. Throughout 
this chapter, F will be a field. 


1. EXTENSION FIELDS 


Let F and K be fields with F contained in K. If we think of K being fixed 
and F varying, we usually say that F is a subfield of K. However, in this 
section we will be looking at a given field F and investigating the properties 
of fields that contain F. To emphasize this point of view, we will call K an 
extension field, or simply an extension, of F. (The terms ‘“‘superfield’’ or 
“‘overfield’’ might also be used to describe the relation of K to F.) Since 
K contains F, it follows that K is an algebra over F and, in particular, K is 
a vector space over F. If K is finite dimensional over F, we say K is a finite 
extension of F. The degree of K over F is $\dim_F K$, which we denote by
[K:F].


THEOREM 1. If K is a finite extension of F and L is a finite extension
of K, then L is a finite extension of F and [L:F] = [L:K][K:F].

Proof. Let $x_1, \ldots, x_m$ be a basis of K over F and let $y_1, \ldots, y_n$ be a
basis of L over K. We will show that the mn products $z_{ij} = x_iy_j$ form a basis
for L over F.

First, let us prove that the $z_{ij}$ span L. If w is an element of L, then there
exist elements $u_1, \ldots, u_n$ in K such that $w = u_1y_1 + \cdots + u_ny_n$. Since
each $u_j$ is in K, there exist elements $v_{ij}$ in F such that
$u_j = v_{1j}x_1 + \cdots + v_{mj}x_m$. Then

$$w = \sum_j u_jy_j = \sum_j \Big( \sum_i v_{ij}x_i \Big) y_j = \sum_{i,j} v_{ij}z_{ij}.$$

Therefore the $z_{ij}$ span L over F.

All that remains is to show that the $z_{ij}$ are linearly independent. Suppose

$$0 = \sum_{i,j} v_{ij}z_{ij} = \sum_j \Big( \sum_i v_{ij}x_i \Big) y_j,$$

where the $v_{ij}$ are in F. Set $u_j = v_{1j}x_1 + \cdots + v_{mj}x_m$. Then each $u_j$ is in K and
$u_1y_1 + \cdots + u_ny_n = 0$. Since the $y_j$ are linearly independent over K, this
means that each $u_j$ is 0. But the $x_i$ are linearly independent over F, so
$0 = v_{1j}x_1 + \cdots + v_{mj}x_m$ implies that each $v_{ij} = 0$. Thus the $z_{ij}$ are linearly
independent over F. □


COROLLARY 2. Let L be a finite extension of F and let K be a field
with $F \subseteq K \subseteq L$. Then K is a finite extension of F and L is a finite extension
of K. Thus [L:F] = [L:K][K:F].

Proof. We are given that L is a finite dimensional vector space over
F. Since K is a subspace of L, it follows that K is finite dimensional. If
$z_1, \ldots, z_n$ span L over F, then $z_1, \ldots, z_n$ certainly span L over K.
Therefore L is a finite extension of K and Theorem 1 now applies. □

The field C is an extension of R. If $i^2 = -1$, then 1, i is a basis for
C over R, so [C:R] = 2.

Let K be an extension of F and let a be an element of K. Since K is
an algebra over F, we have the evaluation map defined in Section 5.4 taking
F[X] to K with a polynomial f being mapped to f(a). This map is an algebra
homomorphism whose kernel $J_a$ is an ideal of F[X]. If f is in $J_a$, then
f(a) = 0, and we say a satisfies f or a is a root of f. The image of F[X]
under the evaluation map at a is the smallest subring of K containing both
F and a and is denoted F[a]. The intersection of all subfields of K that
contain F and a is again a subfield of K containing F and a. This subfield
is denoted F(a). It is the smallest subfield containing F and a. We also call
F(a) the subfield obtained by adjoining a to F. Clearly, $F[a] \subseteq F(a)$.

The element a of K is algebraic over F if $J_a \ne 0$. Thus a is algebraic
if and only if f(a) = 0 for some nonzero polynomial f in F[X]. If a is not a
root of any nonzero polynomial in F[X], we say a is transcendental over F.
The real number $\sqrt{2}$ is algebraic over Q, since $\sqrt{2}$ is a root of the polynomial
$X^2 - 2$ in Q[X]. It can be shown that the real numbers $\pi$ and e are
transcendental over Q. Here $\pi$ = 3.14159... is the ratio of the circumference of a
circle to its diameter and e = 2.71828... is the base of the natural
logarithms.

Suppose a is algebraic over F. Then $J_a$ is a nonzero proper ideal of
F[X] and therefore consists of all multiples of a unique monic polynomial
f of positive degree. We call f the minimal polynomial of a over F.

THEOREM 3. Let a be an element of an extension field K of F and
suppose a is algebraic over F. Then the minimal polynomial f of a over F is
irreducible. Moreover, F(a) = F[a] and F(a) is isomorphic to $F[X]/J_a$. The
degree [F(a):F] is equal to the degree of f.

Proof. Suppose a is algebraic over F. If the minimal polynomial f of
a factors as f = gh in F[X], then 0 = f(a) = g(a)h(a). Therefore either g(a) = 0
or h(a) = 0. Let us say g(a) = 0. Then g is in $J_a$; thus f divides g and h is a unit
in F[X]. Therefore f is irreducible in F[X]. By Corollary 5.4.4, $F[X]/J_a$
is a field. But $F[X]/J_a$ is isomorphic to F[a] and hence F[a] is a field.
Since F and a are both contained in F[a], this means $F(a) \subseteq F[a]$. But
then F(a) = F[a]. Let n be the degree of f. By Theorem 5.4.1, the dimension
of F(a) over F is n. That is, [F(a):F] = n. □


The following result summarizes some important properties of alge- 
braic elements. 


THEOREM 4. Let a be an element of an extension field K of F. Then
the following are equivalent.

(a) a is algebraic over F.
(b) F(a) is a finite extension of F.
(c) F[a] is finite dimensional over F.
(d) F(a) = F[a].
(e) Either a = 0 or $a^{-1}$ is in F[a].

Proof. By Theorem 3, condition (a) implies condition (b). If condition
(b) holds, then F(a) is finite dimensional and contains F[a]. Therefore
condition (c) holds. If condition (c) holds and $n = \dim_F F[a]$, then the
n + 1 elements $1, a, \ldots, a^n$ cannot be linearly independent over F. This
implies that condition (a) is true. Thus conditions (a), (b), and (c) are
equivalent.

We complete the proof by showing that conditions (a), (d), and (e)
are equivalent. Theorem 3 shows that condition (a) implies condition (d).
If condition (d) holds, then F[a] is a field, and so condition (e) must be
true. Finally, suppose condition (e) holds. If a = 0, then a is a root of the
polynomial X in F[X]. If $a \ne 0$, then $a^{-1}$ is in F[a]. Thus there
exist $c_0, \ldots, c_n$ in F such that

$$a^{-1} = c_0 + c_1a + \cdots + c_na^n.$$

Multiplying by a, we get

$$1 = c_0a + c_1a^2 + \cdots + c_na^{n+1},$$

and so a is a root of $c_nX^{n+1} + \cdots + c_0X - 1$. Therefore, in either case,
a is algebraic. □


Let us consider an example. The polynomial $f = 1 - X + X^3$ is irreducible
in Q[X]. Let a be a root of f in C and set K = Q(a). Then f is the
minimal polynomial of a over Q and [K:Q] = 3. Let $b = a^2$. Since b is in
K, the elements $1, b, b^2, b^3$ cannot be linearly independent over Q. Thus
there exist rational numbers x, y, z, w, not all 0, such that
$x + yb + zb^2 + wb^3 = 0$. Thus b is algebraic over Q. To find the minimal polynomial of
b, we need to know all rational solutions x, y, z, w of the equation
$x + yb + zb^2 + wb^3 = 0$. Now $a^3 = a - 1$, so

$$b^2 = a^4 = a(a^3) = a(a - 1) = a^2 - a,$$
$$b^3 = a^6 = (a^3)^2 = (a - 1)^2 = a^2 - 2a + 1.$$

Therefore

$$\begin{aligned} 0 = x + yb + zb^2 + wb^3 &= x + ya^2 + z(a^2 - a) + w(a^2 - 2a + 1) \\ &= (x + w)1 - (z + 2w)a + (y + z + w)a^2. \end{aligned}$$

Since $1, a, a^2$ are linearly independent over Q, we must have

$$\begin{aligned} x + w &= 0, \\ z + 2w &= 0, \\ y + z + w &= 0. \end{aligned}$$

Using the techniques of Section 6.4 to solve this homogeneous system,
we find that (x, y, z, w) must be a rational multiple of (-1, 1, -2, 1).
Thus b is a root of $g = -1 + X - 2X^2 + X^3$ and b is not a root of any
nonzero polynomial in Q[X] of degree less than 3. Therefore g is the
minimal polynomial of b over Q.
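
The arithmetic in this example can be checked mechanically. The Python sketch below is an illustration with hypothetical names. It stores an element of K = Q(a) as its coefficient triple with respect to the basis $1, a, a^2$ and reduces products with the rule $a^3 = a - 1$.

```python
from fractions import Fraction

def mul(u, v):
    """Multiply two elements of Q(a), each given as coefficients of
    1, a, a^2, using the relation a^3 = a - 1."""
    prod = [Fraction(0)] * 5
    for i, x in enumerate(u):
        for j, y in enumerate(v):
            prod[i + j] += x * y
    # Replace a^4 and then a^3, using a^k = a^(k-2) - a^(k-3).
    for k in (4, 3):
        c = prod[k]
        prod[k] = Fraction(0)
        prod[k - 2] += c
        prod[k - 3] -= c
    return prod[:3]

def combo(coeffs, elems):
    """Rational linear combination of elements of Q(a)."""
    return [sum(c * e[i] for c, e in zip(coeffs, elems)) for i in range(3)]

one, a = [1, 0, 0], [0, 1, 0]
b = mul(a, a)          # b = a^2
b2 = mul(b, b)         # b^2 = a^2 - a
b3 = mul(b2, b)        # b^3 = a^2 - 2a + 1

# g(b) = -1 + b - 2b^2 + b^3 should be the zero element of K.
print(combo([-1, 1, -2, 1], [one, b, b2, b3]))

# Condition (e) of Theorem 4: a^(-1) = 1 - a^2 lies in Q[a].
print(mul(a, [1, 0, -1]))
```

The second computation uses $a(1 - a^2) = a - a^3 = 1$, which follows directly from $a^3 = a - 1$.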

The following theorem is an important consequence of Theorem 4. 


THEOREM 5. Let L be an extension of F. The set K of all elements
in L that are algebraic over F is a subfield of L containing F.

Proof. First, we note that if a is in F, then a is a root of the polynomial
X - a in F[X], so a is in K. Now let a and b be any two elements
of K. Then $F_1 = F(a)$ is a finite extension of F. Also, since b is a root of a
nonzero polynomial in F[X] and $F[X] \subseteq F_1[X]$, the field $F_2 = F_1(b)$ is
a finite extension of $F_1$. Therefore $F_2$ is a finite extension of F. Suppose
c is any one of the elements a + b, ab, or (if $a \ne 0$) $a^{-1}$. Then c is in $F_2$
and $F[c] \subseteq F_2$. Therefore F[c] is finite dimensional over F and c is
algebraic over F by Theorem 4. Thus K is a subfield of L. □


If $a_1, \ldots, a_n$ are elements of an extension field K of F, then
$F(a_1, \ldots, a_n)$ is defined recursively by the formula

$$F(a_1, \ldots, a_n) = F(a_1, \ldots, a_{n-1})(a_n).$$

In the exercises an alternate definition is discussed. In particular, it can be
shown that $F(a_1, \ldots, a_n)$ does not depend on the order of the $a_i$. If each
$a_i$ is algebraic over F, then $a_i$ is algebraic over $F(a_1, \ldots, a_{i-1})$ and
$F(a_1, \ldots, a_i)$ is a finite extension of $F(a_1, \ldots, a_{i-1})$. By induction,
$F(a_1, \ldots, a_n)$ is a finite extension of F.

We say that an extension K of F is an algebraic extension if every
element of K is algebraic over F.


THEOREM 6. Let K be an algebraic extension of F and let L be an
algebraic extension of K. Then L is an algebraic extension of F.

Proof. Let u be an element of L. Then u is a root of a monic polynomial
$a_0 + a_1X + \cdots + a_{n-1}X^{n-1} + X^n$ in K[X]. Let $M = F(a_0, \ldots, a_{n-1})$.
Each $a_i$ is in K and thus is algebraic over F. Therefore M is a finite
extension of F. Now u is algebraic over M, so M(u) is a finite extension of
M and hence of F. Since $F(u) \subseteq M(u)$, F(u) is a finite extension of F, and
u is algebraic over F. □


The set of all complex numbers that are algebraic over Q is an algebraic
extension of Q that is not a finite extension.

Let f be an element of F[X] with positive degree n and suppose K is
an extension of F. By Theorem 4.9.8, the number of distinct roots of
f in K is at most n. If a is in K, then, by Theorem 4.11.7, a is a root of
f if and only if X - a divides f in K[X]. If a is a root of f, then the
multiplicity of a as a root of f is the largest integer m such that $(X - a)^m$ divides
f. A multiple root of f is a root of multiplicity greater than 1. If $a_1, \ldots, a_r$
are the distinct roots of f in K and $a_i$ has multiplicity $m_i$, then

$$f = g \prod_{i=1}^{r} (X - a_i)^{m_i},$$

where g is a polynomial in K[X] having no roots in K. We will call
$m_1 + \cdots + m_r$ the number of roots of f in K counting multiplicities. This
number is at most n.

Theorem 4.9.8 puts a significant restriction on the structure of the
multiplicative group of a field.


THEOREM 7. Let A be a finite subgroup of the multiplicative group
of a field. Then A is cyclic.

Proof. By Corollary 6.5.9, A is isomorphic to a direct sum

$$B = Z_{d_1} \oplus \cdots \oplus Z_{d_r},$$

where each $d_i > 1$ and $d_i$ divides $d_{i+1}$, $1 \le i < r$. For any x in B we have
dx = 0, where $d = d_r$. This means that $a^d = 1$ for all a in A. (We are using
additive notation in B and multiplicative notation in A.) Thus every
element of A is a root of the polynomial $X^d - 1$. By Theorem 4.9.8, the
number of roots of $X^d - 1$ in a given field is at most d. Therefore
$d_r = d \ge |A| = d_1 \cdots d_r$. This is impossible if r > 1, so r = 1 and A is cyclic. □

COROLLARY 8. The multiplicative group of a finite field is cyclic.

Proof. Clearly, the multiplicative group of a finite field is finite, so
Theorem 7 applies. □
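
Corollary 8 can be observed directly in a small prime field. The following Python sketch is an illustration with hypothetical function names. It finds, by brute force, the elements of $Z_{13}$ whose multiplicative order is 12, that is, the generators of the multiplicative group.

```python
def multiplicative_order(a, p):
    """Order of a in the multiplicative group of Z_p (p prime, a not 0 mod p)."""
    k, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        k += 1
    return k

def generators(p):
    """All elements of Z_p that generate the multiplicative group."""
    return [a for a in range(1, p) if multiplicative_order(a, p) == p - 1]

print(generators(13))
```

The search is exhaustive and only practical for small p; it is meant to make the statement concrete, not to suggest an efficient method.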


EXERCISES 


1 Let K be a finite extension of F and suppose [K:F] is a prime. 
Show that there are no subfields of K that contain F except F 
and K. 


2 Let K = Q(a), where a is a root of 1 — X + X? in C. Find the 
minimal polynomials of a+ 1 and a? +a. 


3 The polynomial $f = 2 + X + X^2$ is irreducible in $R = Z_3[X]$. Thus
F = R/Rf is a field with nine elements. Find a nonzero element
of F whose multiplicative order is 8. (The procedures with prefix
ZNX F discussed in Section 5.4 may be of use.)


4 Given that $\pi$ is transcendental over Q, show that $\sqrt{\pi}$ and $\pi^2$ are
transcendental over Q.

5 Let K be the field of rational functions in one variable over Q.
Show that any element of K that is algebraic over Q belongs to Q.


6 Find generators for the multiplicative groups of the fields $Z_p$,
p = 2, 3, 5, 7, 11.


2. SPLITTING FIELDS 


Let f be a polynomial in F[X] of positive degree n and let K be an
extension of F. The number of roots of f in K counting multiplicities is at most
n. In this section we will show that it is possible to choose K so that the
number of roots of f in K counting multiplicities is exactly n.

First, we must show that we can construct extensions in which a given 
irreducible polynomial has a root. 


THEOREM 1. Let f be a monic irreducible polynomial in F[X]. There 
exists an extension field K of F and an element a of K such that 

(a) K = F(a). 

(b) a is algebraic over F with minimal polynomial f. 




Proof. Let J be the ideal of all multiples of f in F[X]. Then, by
Theorem 1 and Corollary 4 of Section 5.4, the quotient K = F[X]/J is a field
and an F-algebra of dimension n = deg(f). Now F is not strictly a subfield
of K. However, we may identify an element c of F with $c \cdot 1$ in K and
consider F to be contained in K. Let a be the coset of J in F[X] containing
X. That is, a = X + J. Then

$$f(a) = f(X + J) = f(X) + J = J,$$

since f is in J. But J is the zero element of K, so f(a) = 0 in K. Therefore
the minimal polynomial of a divides f. Since f is monic and irreducible,
f must be the minimal polynomial of a over F. Finally, [F(a):F] = n =
[K:F], so K = F(a). □
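
The construction in this proof can be carried out by hand in the smallest case. The Python sketch below is an illustration; the choice of $F = Z_2$ and $f = 1 + X + X^2$, which has no root in $Z_2$ and is therefore irreducible, is an assumption made for the example, as are the function names. Elements of K = F[X]/J are stored as pairs $(c_0, c_1)$ standing for $c_0 + c_1a$ with a = X + J.

```python
P = 2  # the prime field F = Z_2

def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    """Multiply c0 + c1*a by e0 + e1*a in F[X]/(1 + X + X^2),
    replacing a^2 by -a - 1."""
    c0 = u[0] * v[0]
    c1 = u[0] * v[1] + u[1] * v[0]
    c2 = u[1] * v[1]
    return ((c0 - c2) % P, (c1 - c2) % P)

one, a = (1, 0), (0, 1)
elements = [(x, y) for x in range(P) for y in range(P)]

# f(a) = 1 + a + a^2 is the zero element of K, so a is a root of f in K.
print(add(add(one, a), mul(a, a)))

# Every nonzero element has an inverse, so K is a field with 4 elements.
for u in elements:
    if u != (0, 0):
        print(u, [v for v in elements if mul(u, v) == one])
```

Here [K:F] = 2 = deg(f), in agreement with the dimension count in the proof.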


Now we can handle an arbitrary polynomial in F[X]. 


THEOREM 2. Let f be a polynomial of positive degree n. There exists
an extension K of F and an extension L of K such that

(a) f has a root in K.
(b) The number of roots of f in L counting multiplicities is n.
(c) $[K:F] \le n$.
(d) $[L:F] \le n!$.

Proof. Let g be a monic irreducible factor of f in F[X]. By Theorem
1, there is an extension K of F such that g has a root a in K and [K:F] =
deg(g) $\le$ n. The element a is a root of f and, in K[X], we can write f =
(X - a)h, where deg(h) = n - 1. By induction, we can find an extension
L of K such that the number of roots of h in L counting multiplicities is
n - 1 and $[L:K] \le (n - 1)!$. Then $[L:F] \le n!$ and the number of roots of
f in L counting multiplicities is n. □


Let f be a polynomial of positive degree n in F[X]. A splitting field
for f over F is an extension L of F such that f has n roots in L counting
multiplicities and $L = F(a_1, \ldots, a_n)$, where $a_1, \ldots, a_n$ are
the roots of f in L with a root of multiplicity m repeated m times. In L[X]
we can write

$$f = c \prod_{i=1}^{n} (X - a_i),$$

where c is in F, but for no proper subfield K of L containing F can f be
written as a product of linear polynomials in K[X]. Theorem 2 guarantees
the existence of a splitting field L for f over F with [L:F] ≤ n!. It is
conceivable that f could have many splitting fields with very different
structures. However, it is a fact that all splitting fields of f are
isomorphic as F-algebras, but we will not prove this result here.




Let us consider an example. The polynomial $f = X^3 - 2$ is irreducible
in Q[X]. If $a = \sqrt[3]{2}$, the real cube root of 2, and
$\omega = (-1 + \sqrt{3}\,i)/2$, then the roots of f in C are $a$,
$\omega a$, and $\omega^2 a$. The field $L = Q(a, \omega a, \omega^2 a)$ is
a splitting field for f over Q. What is [L:Q]? Since f is irreducible,
[Q(a):Q] = 3 and 3 divides [L:Q] by Theorem 1.1. Now $L = Q(a, \omega)$ and
$\omega$ is a root of the polynomial $1 + X + X^2$. Thus [L:Q(a)] ≤ 2, so
[L:Q] = 3 or 6. Now Q(a) is contained in R, but $\omega$ is not in R.
Therefore L ≠ Q(a) and [L:Q] = 6.

It is important to be able to tell whether a polynomial has multiple 
roots in a splitting field without actually constructing a splitting field. In 
order to do this, it turns out to be useful to define the derivative of a
polynomial with coefficients in an arbitrary field. In calculus the
derivative is defined as a limit. This approach is not possible in our
setting. However, in calculus it is shown that if
$f = a_0 + a_1 X + \cdots + a_n X^n$ is in R[X], then the derivative f' of
f is given by the formula

$$f' = a_1 + 2a_2 X + \cdots + n a_n X^{n-1}. \qquad (*)$$

This formula makes sense even when the coefficients $a_i$ are not real
numbers but lie in any field. We will take (*) as the definition of f' for
$f = a_0 + a_1 X + \cdots + a_n X^n$ in F[X]. Clearly, f' is again in F[X].
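Formula (*) translates directly into code. The following Python sketch is only an illustration and not part of the CLASSLIB workspace; the function name `derivative` and the optional modulus argument `p` are assumptions made here. Coefficient lists are written constant term first, as in the APL vectors of the text.

```python
def derivative(coeffs, p=None):
    """Formal derivative of a_0 + a_1 X + ... + a_n X^n, following (*).

    coeffs lists a_0, a_1, ..., a_n (constant term first).
    If p is given, the result is reduced modulo the prime p.
    The zero polynomial is returned as the empty list.
    """
    result = [i * a for i, a in enumerate(coeffs)][1:]
    if p is not None:
        result = [c % p for c in result]
    # Drop trailing zeros so that the degree can be read off the length.
    while result and result[-1] == 0:
        result.pop()
    return result


print(derivative([25, -30, 19, -6, 1]))   # coefficients of -30 + 38X - 18X^2 + 4X^3
print(derivative([1, 0, 1, 0, 1], p=2))   # 1 + X^2 + X^4 over Z_2 has derivative 0
```

No limits are involved: the derivative is computed purely from the coefficients, which is why the definition works over any field.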

LEMMA 3. If f and g are in F[X] and c is in F, then

(a) (f + g)' = f' + g'.

(b) (cf)' = cf'.

(c) (fg)' = f'g + fg'.

Proof. These formulas are familiar from calculus. Exercise 2 outlines
an approach to proving their correctness in this more general situation. □


The next result shows why we have introduced derivatives. 


THEOREM 4. Let f be a polynomial in F[X] of positive degree and let
K be a splitting field of f over F. Then f has a multiple root in K if and
only if f and f' have a common factor of positive degree in F[X].


Proof. Suppose first that f has a multiple root a in K. Then $(X - a)^2$
divides f in K[X] so we can write $f = (X - a)^2 g$, where g is in K[X]. Thus

$$f' = 2(X - a)g + (X - a)^2 g',$$

and X - a is a common factor of f and f' in K[X]. But could it still happen
that f and f' have no common factor of positive degree in F[X]? No. We
can compute the greatest common divisor of f and f' using the Euclidean
algorithm, and the result is the same whether we consider the polynomials
to be in F[X] or K[X]. Thus f and f' cannot be relatively prime in F[X]
and have the common factor X - a in K[X]. Therefore f and f' have a
nontrivial common factor in F[X].




Now suppose that h is a common factor of f and f' in F[X] of positive
degree. The roots of h are roots of f, so h has a root a in K and a is also
a root of f'. Let f = (X - a)g, where g is in K[X]. Then

$$f' = g + (X - a)g'.$$

Since X - a divides f', it follows that X - a divides g and thus $(X - a)^2$
divides f. Therefore a is a multiple root of f. □


Let us consider an example. Suppose $f = 25 - 30X + 19X^2 - 6X^3 + X^4$
in R[X]. Then $f' = -30 + 38X - 18X^2 + 4X^3$. Computing gcd(f, f'),

      F←25 ¯30 19 ¯6 1
      F1←¯30 38 ¯18 4
      F RXGCD F1
5 ¯3 1

we find that f and f' have a nontrivial common factor, and so f has
multiple roots in any splitting field.
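The same test can be run outside APL. The sketch below is an illustrative Python analogue and not the RXGCD procedure itself: it works over the rationals with exact `Fraction` arithmetic (an assumption made here to avoid rounding questions), and the helper names are hypothetical.

```python
from fractions import Fraction


def trim(a):
    """Copy of a coefficient list (constant term first) without trailing zeros."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_rem(a, b):
    """Remainder of a on division by the nonzero polynomial b."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        a = trim(a)          # the leading term has been cancelled
    return a


def poly_gcd(a, b):
    """Monic greatest common divisor by the Euclidean algorithm (a nonzero)."""
    a = trim(Fraction(c) for c in a)
    b = trim(Fraction(c) for c in b)
    while b:
        a, b = b, poly_rem(a, b)
    return [c / a[-1] for c in a]


f = [25, -30, 19, -6, 1]
f_prime = [-30, 38, -18, 4]
print(poly_gcd(f, f_prime))   # the monic polynomial 5 - 3X + X^2
```

Here $f = (5 - 3X + X^2)^2$, so the common factor found is exactly the repeated factor of f.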

In R[X] the only polynomials whose derivatives are 0 are the constant
polynomials. This result from calculus does not carry over to F[X] in
general. For example, if $f = 1 + X^2 + X^4$ in $Z_2[X]$, then
$f' = 2X + 4X^3 = 0$ as 2 = 0 in $Z_2$.

THEOREM 5. Let f be a polynomial in F[X] and suppose f' = 0.
Then either f is a constant polynomial or the characteristic p of F is a
prime and f has the form $g(X^p)$ for some g in F[X].


Proof. Let $f = a_0 + a_1 X + \cdots + a_n X^n$. Then
$0 = f' = a_1 + 2a_2 X + \cdots + n a_n X^{n-1}$. Therefore $i a_i = 0$,
1 ≤ i ≤ n. If the characteristic p of F is 0, then for 1 ≤ i ≤ n we have
i ≠ 0 in F, so $a_i = 0$. Therefore $f = a_0$ is a constant polynomial. If
p is a prime, then i is still not 0 in F unless p divides i. Therefore
$a_i = 0$ if p does not divide i and

$$f = a_0 + a_p X^p + a_{2p} X^{2p} + \cdots = g(X^p),$$

where $g = a_0 + a_p X + a_{2p} X^2 + \cdots$. □


COROLLARY 6. Let f be an irreducible polynomial in F[X] and suppose
f has a multiple root in its splitting field. Then the characteristic p of
F is a prime and $f = g(X^p)$ for some g in F[X].


Proof. Since f has a multiple root, f and f' have a common factor of
positive degree. Since f is irreducible, the only factors of f of positive
degree are associates of f. Therefore f divides f'. But deg(f') < deg(f),
and so f' must be 0. By Theorem 5, p is a prime and $f = g(X^p)$. □




EXERCISES 


1 Let f have positive degree n in F[X]. Suppose K is any splitting
field of f over F. Show that [K:F] ≤ n!.


2 Let D: F[X] → F[X] be the derivative map. Show that D is a linear
transformation, thereby establishing parts (a) and (b) of Lemma 3.
Next prove part (c) for the special case $f = X^i$, $g = X^j$. Finally,
prove (c) for all f and g.


3 Show that the chain rule [f(g)]' = f'(g)g' holds in F[X] for any
field F.


4 For each of the following polynomials f let L be the subfield of
C generated by Q and the roots of f. Determine [L:Q].
(a) $-2 + X - 2X^2 + X^3$.
(b) $2 + 2X + X^3$.
(c) $1 - 3X + X^3$.


5 Let F be a vector listing the coefficients of a polynomial f in
R[X]. Write an APL expression for the vector of coefficients of
f'. Assume ⎕IO is 0.


6 For each of the following polynomials f in Q[X] determine
whether f has a multiple root in a splitting field. (Is it "safe" to
use RXGCD to compute the greatest common divisor of f and
f'?)

(a) $2 - X + 3X^2 - 2X^3 + X^4 + X^5$.
(b) $75 - 5X + 23X^2 + 5X^3 + X^4 + X^5$.
(c) $6 + 23X + 24X^2 + 15X^3 + 6X^4 + X^5$.


3. FINITE FIELDS 


In this section we will give a complete description of all finite fields. If
F is a finite field, then the characteristic of F is a prime p and F contains
a subfield isomorphic to $Z_p$. We may consider F to be an extension of
$Z_p$. If $[F:Z_p] = n$, then $|F| = p^n$. Our main result will be that for
each prime power $p^n$ there is, up to isomorphism, exactly one field with
$p^n$ elements. We will construct finite fields as splitting fields of
certain polynomials in $Z_p[X]$.


THEOREM 1. Let p be a prime and let K be an extension of $Z_p$ of
degree n. Then K is a splitting field for $X^{p^n} - X$ over $Z_p$ and K has
the form $Z_p(a)$ for some a in K. If f is an irreducible element of
$Z_p[X]$ of degree n, then K is a splitting field for f over $Z_p$ and f
divides $X^{p^n} - X$.


Proof. Let a be a nonzero element of K. The multiplicative group of
K has order $p^n - 1$, so $a^{p^n - 1} = 1$ or $a^{p^n - 1} - 1 = 0$.
Multiplying by a, we get $a^{p^n} - a = 0$, so a is a root of
$h = X^{p^n} - X$. Clearly, 0 is also a root of h. Therefore every element
of K is a root of h. There are $p^n = \deg(h)$ roots of h in K, so there
cannot be any more roots of h in an extension field of K. Since K is
certainly generated as an extension of $Z_p$ by the roots of h, it follows
that K is a splitting field for h. If we choose a to be a generator of the
multiplicative group of K, which is cyclic by Corollary 1.8, then
$K = Z_p(a)$.

Now suppose f in $Z_p[X]$ is irreducible of degree n. Let L be a splitting
field for f over K and let b be a root of f in L. Then $M = Z_p(b)$ is an
extension of $Z_p$ of degree n and so, by the previous argument, every
element of M is a root of h. But the roots of h in L are elements of K, so
b is in K. Therefore L = K and $K = Z_p(b)$. Thus K is a splitting field for
f over $Z_p$. Since f is the minimal polynomial for b and b is a root of h,
we see that f divides h. □


COROLLARY 2. For all a in $Z_p$ we have $a^p - a = 0$. Moreover, in
$Z_p[X]$,

$$X^p - X = X(X - 1)(X - 2) \cdots (X - p + 1).$$

Proof. By Theorem 1, with n = 1, every element of $Z_p$ is a root of
$h = X^p - X$. Therefore $g = X(X - 1) \cdots (X - p + 1)$ divides h. But g
and h have the same degree and the same leading coefficient, so g = h. □
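The identity of Corollary 2 is easy to confirm numerically for small primes. The following Python sketch is an illustration only; the function names are hypothetical and coefficient lists are written constant term first.

```python
def poly_mul_mod(a, b, p):
    """Product of two coefficient lists (constant term first) over Z_p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return out


def product_of_linear_factors(p):
    """Coefficients of X(X - 1)(X - 2)...(X - p + 1) over Z_p."""
    prod = [1]
    for r in range(p):
        prod = poly_mul_mod(prod, [(-r) % p, 1], p)
    return prod


def x_to_p_minus_x(p):
    """Coefficients of X^p - X over Z_p."""
    coeffs = [0] * (p + 1)
    coeffs[1] = (-1) % p
    coeffs[p] = 1
    return coeffs


for p in (2, 3, 5, 7, 11, 13):
    print(p, product_of_linear_factors(p) == x_to_p_minus_x(p))
```

Each line printed should report `True`, in agreement with the corollary.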


THEOREM 3. Let p be a prime and n a positive integer. There exist
fields with $p^n$ elements, and any two such fields are isomorphic.


Proof. Let $h = X^{p^n} - X$ and let L be a splitting field for h over
$Z_p$. Since h' = -1, we know by Theorem 2.4 that h has no multiple roots in
L, so the set K of roots of h in L has $p^n$ elements. Suppose a and b are
in K. Then

$$(ab)^{p^n} = a^{p^n} b^{p^n} = ab$$

and ab is in K. By the Binomial Theorem,

$$(a + b)^p = a^p + \binom{p}{1} a^{p-1} b + \binom{p}{2} a^{p-2} b^2 + \cdots + b^p.$$

Now

$$\binom{p}{i} = \frac{p(p - 1) \cdots (p - i + 1)}{i!}$$

is divisible by p unless i = 0 or i = p. Since L has characteristic p, we
have

$$(a + b)^p = a^p + b^p.$$

By induction, it follows that

$$(a + b)^{p^n} = a^{p^n} + b^{p^n} = a + b,$$

and so a + b is in K too. By Corollary 2, every element of $Z_p$ is in K, so
K is a subfield of L. But L is generated as an extension of $Z_p$ by the
roots of h, so L = K is a field with $p^n$ elements.

Now let M be any other field with $p^n$ elements. We may assume M is
an extension of $Z_p$. By Theorem 1, there is an element b such that
$M = Z_p(b)$. Let f be the minimal polynomial of b in $Z_p[X]$. Again by
Theorem 1, there is a root c of f in K. Since f is irreducible, we must have
$K = Z_p(c)$. But, by Theorem 1.3, this means that

$$K \cong Z_p[X]/I \cong M,$$

where I is the ideal of all multiples of f in $Z_p[X]$. Thus all fields with
$p^n$ elements are isomorphic. □


COROLLARY 4. If p is a prime and n is a positive integer, then there
exist irreducible polynomials of degree n in $Z_p[X]$. If f is such an
irreducible polynomial, then f divides $X^{p^n} - X$ and f has no multiple
roots in a splitting field. Every irreducible factor of $X^{p^n} - X$ has
degree at most n.

Proof. By Theorem 3, there is a field K with $p^n$ elements and, by
Theorem 1, there is an element a of K such that $K = Z_p(a)$. The minimal
polynomial for a over $Z_p$ is an irreducible polynomial of degree n in
$Z_p[X]$. If f in $Z_p[X]$ is irreducible of degree n, then f divides
$h = X^{p^n} - X$ by Theorem 1 and, since h has no multiple roots, neither
does f. Let g be an irreducible factor of $X^{p^n} - X$. Then g has a root b
in K. The minimal polynomial $g_0$ of b over $Z_p$ has degree at most n.
Since $g_0$ divides g and g is irreducible, the degree of g is at most n. □


Since any two fields with $p^n$ elements are isomorphic, we often speak
of the field of order $p^n$ and denote it by GF($p^n$). Here "GF" stands for
"Galois field".

It is reasonable to ask when one finite field can be contained in another.

THEOREM 5. The field GF($p^n$), p a prime, contains exactly one subfield
with $p^m$ elements for each divisor m of n and contains no other subfields.


Proof. Let L = GF($p^n$) and let K be a subfield of L. Then K contains
$Z_p$ and $|K| = p^m$, where $m = [K:Z_p]$. Let r = [L:K]. Then
$n = [L:Z_p] = [L:K][K:Z_p] = rm$, and so m divides n. Every element of K
is a root of $f = X^{p^m} - X$ and f has at most $p^m$ roots in L. Therefore
L has at most one subfield of order $p^m$. Now let M be the splitting field
of f over L and let a be a root of f in M. Then

$$a^{p^m} = a,$$
$$a^{p^{2m}} = (a^{p^m})^{p^m} = a^{p^m} = a,$$

and so, by induction, $a^{p^n} = a^{p^{rm}} = a$. Thus a is a root of
$h = X^{p^n} - X$. But L is a splitting field for h, so all roots of f lie
in L. The set K of roots of f in L is a subfield of L of order $p^m$. □
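Theorem 5 reduces the lattice of subfields of GF($p^n$) to the divisors of n. A minimal Python sketch, with a hypothetical function name, lists the orders of the subfields:

```python
def subfield_orders(p, n):
    """Orders p^m of the subfields of GF(p^n), one for each divisor m of n."""
    return [p ** m for m in range(1, n + 1) if n % m == 0]


print(subfield_orders(2, 6))   # subfields of GF(64)
print(subfield_orders(3, 4))   # subfields of GF(81)
```

The length of the returned list is the number of positive divisors of n, and the field itself is counted as its own subfield.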


The workspace CLASSLIB provides several alternatives for performing
calculations in finite fields. For the fields $Z_p$ we can use the
procedures with the prefix ZN, with N set to p. For GF($p^n$) with n > 1 we
can use ZNXIRRED to find an irreducible polynomial f of degree n in
$Z_p[X]$ and use the procedures with the prefix ZNXF after initializing with

      ZNXFINIT F,

where F is the vector of coefficients of f. If $p^n$ is small, we could
construct addition and multiplication tables for GF($p^n$) and use the
procedures with the prefix FR after initializing with FRINIT. We could even
determine structure constants for GF($p^n$) as an algebra over $Z_p$ and use
the procedures with the prefix ZA.

Let us illustrate these alternatives with GF(8). From


      N←2
      ZNXIRRED 3
0 1 0 0
1 1 0 0
1 1 1 0
1 1 0 1
1 0 1 1

we see that $f = 1 + X + X^3$ is irreducible in $Z_2[X]$. Thus we may take
GF(8) to be $Z_2[X]/\langle f \rangle$, where $\langle f \rangle$ is the
ideal of $Z_2[X]$ generated by f. The elements of GF(8) may be represented
by polynomials of degree less than 3. After initializing,

      ZNXFINIT F←1 1 0 1

we can use the procedures with the prefix ZNXF.

      1 0 1 ZNXFPROD 1 1 1

      0 1 1 ZNXFPOWER 7
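For readers without the CLASSLIB workspace, arithmetic in $Z_2[X]/\langle f \rangle$ can be imitated in a few lines of Python. This is a sketch under stated assumptions: `field_prod` and `field_power` are hypothetical stand-ins written for this illustration, not reimplementations of ZNXFPROD and ZNXFPOWER, and elements are coefficient lists of length 3 with the constant term first.

```python
P = 2
F = [1, 1, 0, 1]          # f = 1 + X + X^3, monic and irreducible over Z_2


def reduce_mod_f(a):
    """Reduce a coefficient list modulo f and modulo P; result has length deg(f)."""
    a = [c % P for c in a]
    n = len(F) - 1
    while len(a) > n:
        lead = a[-1]
        shift = len(a) - 1 - n
        for i, c in enumerate(F):
            a[shift + i] = (a[shift + i] - lead * c) % P
        a.pop()           # the top coefficient is now zero
    return a + [0] * (n - len(a))


def field_prod(a, b):
    """Product of two elements of Z_2[X]/<f>."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return reduce_mod_f(out)


def field_power(a, e):
    """a raised to the nonnegative power e by the binary power algorithm."""
    result, base = [1, 0, 0], list(a)
    while e:
        if e & 1:
            result = field_prod(result, base)
        base = field_prod(base, base)
        e >>= 1
    return result


print(field_prod([1, 0, 1], [1, 1, 1]))   # (1 + X^2)(1 + X + X^2) in GF(8)
print(field_power([0, 1, 1], 7))          # seventh power of X + X^2
```

Since the multiplicative group of GF(8) has order 7, the seventh power of any nonzero element is the identity 1 0 0.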


The elements of GF(8) are listed in the rows of the matrix E constructed
as follows:

      ⎕IO←0
      ⎕←E←⌽⍉2 2 2⊤⍳8
0 0 0
1 0 0
0 1 0
1 1 0
0 0 1
1 0 1
0 1 1
1 1 1


This matrix was chosen because 0 and 1 are listed first. To make an
addition table for GF(8), we first must construct the rank 3 array A such
that A[I;J;] is the sum of E[I;] and E[J;]. Here is one way to construct A.

[APL session constructing and displaying the 8-by-8-by-3 arrays S and R from
E; the displays did not survive extraction.]

      A←R ZNXFSUM S

Since R[I;J;] is E[I;] and S[I;J;] is E[J;], the array A has the
desired property. Now, to get the addition table, we construct the matrix
PLUS such that A[I;J;] is E[PLUS[I;J];].

      PLUS←2⊥⊖1 2 0⍉A
      A[3;5;]
0 1 1
      E[PLUS[3;5];]
0 1 1


To construct the multiplication table for GF(8), we need only replace
ZNXFSUM by ZNXFPROD in the preceding calculation.

      TIMES←2⊥⊖1 2 0⍉R ZNXFPROD S


Now we can use the FR procedures to compute in GF(8).

      PLUS FRINIT TIMES
      3 FRPROD 5

      E[3;] ZNXFPROD E[5;]

      E[4;]
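The table-driven approach can be mimicked in Python. The sketch below is an assumption-laden illustration rather than a transcription of the FR procedures: it numbers the elements of GF(8) from 0 to 7 in the order of the rows of E (index k has coefficient vector given by the binary digits of k, lowest power first) and builds `PLUS` and `TIMES` as nested lists.

```python
# Rows of E: element k has coefficients (k & 1, (k >> 1) & 1, (k >> 2) & 1).
E = [[(k >> i) & 1 for i in range(3)] for k in range(8)]


def index_of(v):
    """Row index in E of the coefficient vector v."""
    return sum(bit << i for i, bit in enumerate(v))


def add(a, b):
    """Sum in GF(8): coefficientwise addition modulo 2."""
    return [(x + y) % 2 for x, y in zip(a, b)]


def mul(a, b):
    """Product in Z_2[X]/<1 + X + X^3>."""
    out = [0] * 5
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] ^= x & y
    # Use X^3 = 1 + X, hence X^4 = X + X^2, to remove the high powers.
    for d in (4, 3):
        if out[d]:
            out[d] = 0
            out[d - 3] ^= 1
            out[d - 2] ^= 1
    return out[:3]


PLUS = [[index_of(add(E[i], E[j])) for j in range(8)] for i in range(8)]
TIMES = [[index_of(mul(E[i], E[j])) for j in range(8)] for i in range(8)]

print(PLUS[3][5], E[PLUS[3][5]])     # sum of elements 3 and 5
print(TIMES[3][5], E[TIMES[3][5]])   # product of elements 3 and 5
```

Once the two tables exist, every later computation is a table lookup, which is the point of the FR procedures when the field is small.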


We close this section with a determination of the number I(n, p) of
monic irreducible polynomials of degree n in $Z_p[X]$.


THEOREM 6. For all primes p and all positive integers n

$$p^n = \sum_{m \mid n} m\,I(m, p),$$

where the sum is over all positive divisors m of n.


Proof. Suppose m divides n. By Theorem 5, the field GF($p^n$) contains
a subfield K isomorphic to GF($p^m$). If f in $Z_p[X]$ is monic and
irreducible of degree m, then f has m distinct roots in K by Theorem 1 and
Corollary 4. Thus mI(m, p) is the number of elements in GF($p^n$) having
minimal polynomial of degree m. If a is in GF($p^n$), then the degree of the
minimal polynomial of a divides n, and so the sum of the numbers mI(m, p) as
m ranges over the divisors of n is simply the order of GF($p^n$). □


Taking n = 1, 2, 3, 4 in Theorem 6, we obtain

$$\begin{aligned}
p &= I(1, p),\\
p^2 &= I(1, p) + 2I(2, p),\\
p^3 &= I(1, p) + 3I(3, p),\\
p^4 &= I(1, p) + 2I(2, p) + 4I(4, p).
\end{aligned}$$

Solving these equations, we find

$$\begin{aligned}
I(1, p) &= p,\\
I(2, p) &= (p^2 - p)/2,\\
I(3, p) &= (p^3 - p)/3,\\
I(4, p) &= (p^4 - p^2)/4.
\end{aligned}$$

In general, it can be shown that

$$I(n, p) = \frac{1}{n} \sum_{m \mid n} \mu\!\left(\frac{n}{m}\right) p^m,$$

where again the sum is over the positive divisors m of n and $\mu$ is the
Möbius function, defined as follows:

$$\mu(k) = \begin{cases}
1, & \text{if } k = 1,\\
(-1)^r, & \text{if } k \text{ is a product of } r \text{ distinct primes},\\
0, & \text{if } k \text{ is divisible by the square of a prime.}
\end{cases}$$

[The function $\mu$ is named for the German mathematician and astronomer
August Ferdinand Möbius (1790-1868).]
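The general formula is straightforward to evaluate. The following Python sketch, with hypothetical function names, computes the Möbius function by trial division and then I(n, p); it also checks the values against Theorem 6.

```python
def mobius(k):
    """Moebius function of the positive integer k, by trial division."""
    result = 1
    d = 2
    while d * d <= k:
        if k % d == 0:
            k //= d
            if k % d == 0:
                return 0          # divisible by the square of a prime
            result = -result
        d += 1
    if k > 1:                     # one prime factor is left over
        result = -result
    return result


def count_irreducible(n, p):
    """Number I(n, p) of monic irreducible polynomials of degree n in Z_p[X]."""
    total = sum(mobius(n // m) * p ** m for m in range(1, n + 1) if n % m == 0)
    return total // n


def theorem_6_holds(n, p):
    """Check p^n = sum of m * I(m, p) over the divisors m of n."""
    divisors = [m for m in range(1, n + 1) if n % m == 0]
    return p ** n == sum(m * count_irreducible(m, p) for m in divisors)


print([count_irreducible(n, 2) for n in range(1, 7)])
print(all(theorem_6_holds(n, p) for p in (2, 3, 5) for n in range(1, 13)))
```

For p = 2 and n = 3 the count is 2, matching the two cubics $1 + X + X^3$ and $1 + X^2 + X^3$ listed by ZNXIRRED in the GF(8) example.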


EXERCISES 


In the following exercises p is a prime. 


1 Show that $X^{p^m} - X$ divides $X^{p^n} - X$ in $Z_p[X]$ if and only if
m divides n.


2 How many subfields do the following fields contain? 


(a) GF(2°). (c) GF(5**). 
(b) GF(37). 


3 Show that the map $\sigma$: GF($p^n$) → GF($p^n$) with
$\sigma(a) = a^p$ is an automorphism of GF($p^n$). What is the order of
$\sigma$?


4 Let f in $Z_p[X]$ be irreducible of degree n and let a be a root of f in
GF($p^n$). Prove that the roots of f in GF($p^n$) are
$a, a^p, a^{p^2}, \ldots, a^{p^{n-1}}$. Show that any automorphism of
GF($p^n$) is a power of the map $\sigma$ of Exercise 3.


5 Suppose f in $Z_p[X]$ is monic and irreducible of degree n and suppose
one root of f in GF($p^n$) is a generator of the multiplicative
group of GF($p^n$). Show that every root of f in GF($p^n$) is a
generator of the multiplicative group. Polynomials with this property
are sometimes called primitive, although this conflicts with the
meaning of “primitive” given in Section 4.1]. Find a formula 
for the number of such primitive polynomials and compute the
number for p = 3 and n = 1, 2, 3, 4.

6 Verify the correctness of the various calculations presented in the
example involving computations in GF(8). Continue the example
by constructing an array of structure constants for GF(8) as an
algebra over $Z_2$.


7 Using the description of GF(8) as $Z_2[X]/\langle f \rangle$, where
$f = 1 + X + X^3$, describe all the roots of f in GF(8) and also the roots
of $1 + X^2 + X^3$.




8 Construct explicit descriptions of GF(9), GF(16), and GF(27) 
along the lines of the example in the text. 


4. FACTORIZATION IN $Z_p[X]$


If p is a prime, then $Z_p[X]$ is a UFD. If f in $Z_p[X]$ has positive
degree n, then any irreducible factor of f must be one of the finitely many
polynomials in $Z_p[X]$ of degree at most n. Thus it is possible to factor f
into a product of irreducible factors. However, this brute force algorithm
is much too slow to be used when $p^n$ is at all large. In 1970 E. R.
Berlekamp showed in the paper listed in the bibliography that there are much
faster algorithms for factoring in $Z_p[X]$. This section is devoted to a
discussion of variants of Berlekamp's algorithms.


THEOREM 1. Let A be a commutative algebra over $Z_p$. The map
T: A → A defined by $xT = x^p$ is an algebra homomorphism. In particular,
T is a linear transformation.


Proof. As in the proof of Theorem 3.3, the Binomial Theorem modulo
p shows that $(x + y)^p = x^p + y^p$ for all x, y in A. Thus
(x + y)T = xT + yT. If a is in $Z_p$, then $a^p = a$, and so for x in A we
have $(ax)^p = a^p x^p = a x^p$ and hence (ax)T = a(xT). Therefore T is a
linear transformation. Since $(xy)^p = x^p y^p$ it follows that T is an
algebra homomorphism. □


The first factorization algorithm rests on the following curious result. 


THEOREM 2. Let A be a finite dimensional commutative algebra
over $Z_p$ and let T: A → A be the pth power map. Then A is a field if and
only if both of the following conditions hold.

(a) T is a bijection.

(b) If x in A satisfies xT = x, then x has the form a1 for some a in $Z_p$.

Proof. Recall that by Corollary 6.3.9, injectivity, surjectivity, and
bijectivity are equivalent for linear transformations of A into itself.

Suppose first that A is a field. If x in A satisfies $xT = x^p = 0$, then
x = 0, since fields are integral domains. Therefore T is injective and hence
bijective. If $xT = x^p = x$, then x is a root of the polynomial
$h = X^p - X$. Since A is a field, h has at most p roots in A. For each a in
$Z_p$ we have $(a1)^p = a^p 1 = a1$, and so the p elements a1 are the only
elements x satisfying xT = x.

Now suppose conditions (a) and (b) hold. To show that A is a field,
we must show that each nonzero x in A is invertible. Consider the ideal
Ax of A. If Ax = A, then 1 is in Ax and so 1 = yx for some y in A and x
is invertible. Thus we may assume Ax ≠ A. The map S: y ↦ yx of A onto
Ax is a linear transformation. Let W be the kernel of S. Since Ax is
isomorphic as a vector space to A/W, we have

$$\dim_{Z_p} W + \dim_{Z_p} Ax = \dim_{Z_p} A.$$

Suppose y is in W ∩ Ax. Then y = zx for some z in A and $y^2 = zxy = 0$.
But then $y^p = 0$, and so y = 0 by condition (a). Therefore
W ∩ Ax = {0}. Since

$$\dim_{Z_p}(W + Ax) + \dim_{Z_p}(W \cap Ax) = \dim_{Z_p} W + \dim_{Z_p} Ax = \dim_{Z_p} A,$$

it follows that $\dim_{Z_p}(W + Ax) = \dim_{Z_p} A$, and so W + Ax = A.
Therefore A is the internal direct sum W ⊕ Ax. Thus there exist u in W and
v in Ax such that 1 = u + v. Then $1 = 1^p = u^p + v^p$. Now Ax is an ideal,
so $v^p$ is in Ax. It is also true that W is an ideal, for if yx = 0 and z
is in A, then (zy)x = z(yx) = 0. Therefore $u^p$ is in W. But 1 can be
written as a sum of something in W and something in Ax in only one way. Thus
$u^p = u$ and $v^p = v$. If u = 0, then v = 1 is in Ax, which we are
assuming is not the case. If v = 0, then 1 = u is in W and so x = ux = 0,
again contrary to our assumption. Thus u and v are both nonzero. Since u is
in W and v is not in W, it follows that u and v are linearly independent.
But, by condition (b), u and v must each be of the form a1 for some a in
$Z_p$. This is again a contradiction, so A is a field. □


Conditions (a) and (b) of Theorem 2 are not difficult to check provided
we can find the matrix Q for T with respect to some basis
$x_1, \ldots, x_n$ of A. Condition (a) is equivalent to det Q ≠ 0. If Y is
the coordinate vector of an element y of A, then yT = y if and only if
YQ = Y or Y(Q - I) = 0, where I is the n-by-n identity matrix. Thus yT = y
if and only if Y is a solution of the homogeneous linear system
$(Q^t - I)Y = 0$. Since the dimension of the set of solutions of this system
is n - r, where r is the rank of $Q^t - I$, condition (b) is equivalent to
r = n - 1.
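The test just described can be sketched in Python for $A = Z_p[X]/\langle f \rangle$ with basis $1, X, \ldots, X^{n-1}$. This is an illustration under stated assumptions, not the CLASSLIB code: the function names are hypothetical, f is assumed monic, and condition (a) is checked through the rank of Q rather than its determinant. The built-in `pow(x, -1, p)` requires Python 3.8 or later.

```python
def poly_rem(a, f, p):
    """Remainder of a modulo the monic polynomial f over Z_p, padded to length deg(f)."""
    a = [c % p for c in a]
    n = len(f) - 1
    while len(a) > n:
        lead = a[-1]
        shift = len(a) - 1 - n
        for i, c in enumerate(f):
            a[shift + i] = (a[shift + i] - lead * c) % p
        a.pop()
    return a + [0] * (n - len(a))


def mul_mod(a, b, f, p):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return poly_rem(out, f, p)


def power_matrix(f, p):
    """Matrix Q whose rows are (X^i)^p mod f for i = 0, ..., deg(f) - 1."""
    x_to_p = poly_rem([0] * p + [1], f, p)
    rows, row = [], poly_rem([1], f, p)
    for _ in range(len(f) - 1):
        rows.append(row)
        row = mul_mod(row, x_to_p, f, p)
    return rows


def rank_mod_p(matrix, p):
    """Rank over Z_p by Gaussian elimination."""
    m = [[x % p for x in row] for row in matrix]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [x * inv % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                factor = m[r][col]
                m[r] = [(x - factor * y) % p for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


def is_irreducible(f, p):
    """Theorem 2 applied to Z_p[X]/<f>: rank Q = n and rank (Q - I) = n - 1."""
    n = len(f) - 1
    q = power_matrix(f, p)
    q_minus_i = [[q[i][j] - (i == j) for j in range(n)] for i in range(n)]
    return rank_mod_p(q, p) == n and rank_mod_p(q_minus_i, p) == n - 1


print(is_irreducible([1, 1, 0, 1], 2))             # 1 + X + X^3 over Z_2
print(is_irreducible([2, 2, 1, 1, 2, 2, 1], 3))    # a reducible sextic over Z_3
```

The rank of a matrix equals the rank of its transpose, so working with Q - I instead of $Q^t - I$ does not change the test.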

Let us use Theorem 2 to decide whether
$f = 1 + 2X + X^2 + 2X^3 + 2X^4 + X^6$ and
$g = 2 + 2X + X^2 + X^3 + 2X^4 + 2X^5 + X^6$ are irreducible in $Z_3[X]$.
By Corollary 5.4.4, f is irreducible if and only if
$A = Z_3[X]/\langle f \rangle$ is a field. We will work in A using the ZNXF
procedures.

      ⎕IO←1
      N←3
      ZNXFINIT F←1 2 1 2 2 0 1

If

      ⎕←I←(⍳6)∘.=⍳6
1 0 0 0 0 0
0 1 0 0 0 0
0 0 1 0 0 0
0 0 0 1 0 0
0 0 0 0 1 0
0 0 0 0 0 1

then the rows of I correspond to the basis $1, X, X^2, X^3, X^4, X^5$ of A.
If

      ⎕←Q←I ZNXFPOWER 3
1 0 0 0 0 0
0 0 0 1 0 0
2 1 2 1 1 0
2 0 0 2 0 0
1 2 1 1 2 0
2 2 1 0 2 0

then each row of Q is the cube of the corresponding row of I. Since the rows
of Q are all elements of the subspace of A spanned by
$1, X, X^2, X^3, X^4$, the cube map T does not map A onto itself, so A is
not a field. Thus f is reducible. How can we find a factorization of f?
Since T is not bijective, there is a nonzero element y of A such that
$y^3 = 0$. To find such an element, we must solve the homogeneous linear
system with matrix ⍉Q. The rows of the matrix W obtained by

      (⍉Q) ZNLSYS 6⍴0
0 0 0 0 0 0

are a basis for the set of solutions of this system. The first row of W
corresponds to the polynomial $h = 1 + X + X^3$. Since f divides $h^3$ but f
does not divide h, the greatest common divisor of h and f must be a proper
nontrivial factor of f. From

      ⎕←U←W[1;] ZNXGCD F
1 1 0 1
      F ZNXQUOT U
1 1 0 1

we find the factorization $f = (1 + X + X^3)^2$. Now let us turn to g.
Proceeding as before,


      ZNXFINIT G←2 2 1 1 2 2 1
      ⎕←Q←I ZNXFPOWER 3
1 0 0 0 0 0
0 0 0 1 0 0
1 1 2 2 1 1
2 1 1 1 0 2
2 1 2 0 2 0
2 1 0 1 1 0
      ZNDET Q

we find this time that T is bijective. We must now consider condition (b)
of Theorem 2. As noted, we want to know the solutions of the homogeneous
linear system with matrix ⍉Q-I. From the calculation

      (⍉Q-I) ZNLSYS 6⍴0
0 0 0 0 0 0
      W
1 0 0 0 0 0
0 1 2 1 0 1

we see that $h = X + 2X^2 + X^3 + X^5$ is an element of A that is not of the
form a1 for some a in $Z_3$ and that satisfies

$$0 = h^3 - h = h(h - 1)(h - 2).$$

Thus condition (b) of Theorem 2 fails and g is not irreducible. To find a
factorization of g, we note that one of the elements h, h - 1, or h - 2 must
not be invertible. Therefore the greatest common divisor of g and one of
h, h - 1, or h - 2 must be a proper factor of g. Computing the greatest
common divisors,

      G ZNXGCD H←0 1 2 1 0 1
      G ZNXGCD H ZNXDIFF 1
      G ZNXGCD H ZNXDIFF 2
      G ZNXQUOT 1 0 2 1

we find that $g = (2 + 2X + X^3)(1 + 2X^2 + X^3)$.
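The three greatest common divisors in this step can be reproduced with a short Python sketch. The helper names are hypothetical and stand in for ZNXGCD and ZNXDIFF only loosely; coefficient lists are written constant term first, and `pow(x, -1, p)` assumes Python 3.8 or later.

```python
def trim(a):
    """Copy of a coefficient sequence without trailing zeros."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_rem(a, b, p):
    """Remainder of a on division by the nonzero polynomial b over Z_p."""
    a, b = trim(c % p for c in a), trim(c % p for c in b)
    inv = pow(b[-1], -1, p)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        factor = a[-1] * inv % p
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - factor * c) % p
        a = trim(a)
    return a


def poly_gcd(a, b, p):
    """Monic greatest common divisor over Z_p (a must be nonzero)."""
    a, b = trim(c % p for c in a), trim(c % p for c in b)
    while b:
        a, b = b, poly_rem(a, b, p)
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]


P = 3
G = [2, 2, 1, 1, 2, 2, 1]    # g from the text
H = [0, 1, 2, 1, 0, 1]       # h = X + 2X^2 + X^3 + X^5

for a in range(P):
    h_minus_a = [(H[0] - a) % P] + H[1:]
    print(a, poly_gcd(G, h_minus_a, P))
```

Because h reduces to a constant modulo each irreducible factor of g, each factor appears in exactly one of the gcds, and a gcd equal to 1 (the list `[1]`) simply means that no factor corresponds to that constant.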
In the second example we see one drawback of this approach. If
condition (a) holds but condition (b) fails, then to find a factorization we
may need to compute p greatest common divisors. If p is large, this will be
time-consuming. Berlekamp suggested the use of a random or probabilistic
procedure in case p is large. A random procedure is one in which random
choices are made. The probability that a particular choice will yield a
solution is good. However, if one is extremely unlucky, a great many choices
will have to be made before the solution is obtained. One such probabilistic
algorithm for factoring in $Z_p[X]$ is based on the following theorem.


THEOREM 3. Let f and g be relatively prime polynomials in $Z_p[X]$,
p a prime. Then

$$Z_p[X]/\langle fg \rangle \cong (Z_p[X]/\langle f \rangle) \oplus (Z_p[X]/\langle g \rangle),$$

where ≅ denotes isomorphism of $Z_p$-algebras.


Proof. This theorem is essentially the same as Corollary 3.6.2. We
can map $Z_p[X]$ into
$A = (Z_p[X]/\langle f \rangle) \oplus (Z_p[X]/\langle g \rangle)$ by
mapping a polynomial h to $([h]_f, [h]_g)$, where $[h]_f$ denotes the coset
$h + \langle f \rangle$. If h is in the kernel of this map $\sigma$, then f
and g each divide h. Since gcd(f, g) = 1, this means fg divides h. Thus the
kernel of $\sigma$ is $\langle fg \rangle$. By the First Isomorphism
Theorem, $Z_p[X]/\langle fg \rangle$ is isomorphic to the image of $\sigma$.
But $Z_p[X]/\langle fg \rangle$ has dimension deg(fg) and A has dimension
deg(f) + deg(g). Since deg(fg) = deg(f) + deg(g), it follows that
$Z_p[X]/\langle fg \rangle$ and A have the same dimension. Therefore
$\sigma$ maps $Z_p[X]$ onto A and $Z_p[X]/\langle fg \rangle$ is isomorphic
to A. □


COROLLARY 4. Suppose f in $Z_p[X]$ is equal to $f_1 f_2 \cdots f_m$, where
the $f_i$ are distinct monic irreducible polynomials of the same degree n.
Then $Z_p[X]/\langle f \rangle$ is isomorphic to the direct sum of m copies
of GF($p^n$).


Proof. By Theorem 3, $Z_p[X]/\langle f \rangle$ is isomorphic as a
$Z_p$-algebra to the direct sum of the algebras
$Z_p[X]/\langle f_i \rangle$, which are all isomorphic to GF($p^n$). □

To formulate our random factorization algorithm, we must distinguish
the two cases p ≥ 3 and p = 2. The case p ≥ 3 is slightly simpler, and we
consider it first.


THEOREM 5. Let $f = f_1 \cdots f_m$ in $Z_p[X]$, where p is an odd prime,
the $f_i$ are distinct monic irreducible polynomials of the same degree n,
and m > 1. Let $e = (p^n - 1)/2$. Suppose a polynomial g of degree less than
mn is chosen randomly in $Z_p[X]$. Then the probability that
gcd(f, $g^e - 1$) is a proper factor of f is

$$q = 1 - \frac{1}{2^m}\left[\left(1 + \frac{1}{p^n}\right)^m + \left(1 - \frac{1}{p^n}\right)^m\right],$$

which is at least 4/9.


Proof. In choosing a random element g in $Z_p[X]$ of degree less than
mn we are choosing a random element of $Z_p[X]/\langle f \rangle$, which by
Corollary 4 is isomorphic to the direct sum A of m copies of GF($p^n$).
Every element of GF($p^n$) is a root of

$$X^{p^n} - X = X(X^{p^n - 1} - 1) = X(X^e - 1)(X^e + 1).$$

Thus for e elements a of GF($p^n$) we have $a^e = 1$, for e elements we have
$a^e = -1$, and for one element, a = 0, we have $a^e = 0$. If
$x = (a_1, \ldots, a_m)$ is an element of A, then
$x^e = (a_1^e, \ldots, a_m^e)$ has only 0, 1, and -1 as components. The
number of x such that $x^e$ has no component equal to 1 is $(e + 1)^m$, and
so the number of x such that $x^e$ has at least one component equal to 1 is
$p^{mn} - (e + 1)^m$. Of these, the number of x such that $x^e$ has all
components 1 is $e^m$. Thus the number of x in A such that $x^e$ has at
least one component equal to 1 and at least one component different from 1
is $p^{mn} - (e + 1)^m - e^m$. For any such x the element $x^e - 1$ is a
nonzero noninvertible element of A. If g in $Z_p[X]$ corresponds to one of
these x's, then $g^e - 1$ is neither relatively prime to f nor divisible by
f, so gcd(f, $g^e - 1$) is a proper factor of f. The probability of this
occurring is

$$q = \frac{p^{mn} - (e + 1)^m - e^m}{p^{mn}} = 1 - \left(\frac{e + 1}{p^n}\right)^m - \left(\frac{e}{p^n}\right)^m = 1 - \frac{1}{2^m}\left[\left(1 + \frac{1}{p^n}\right)^m + \left(1 - \frac{1}{p^n}\right)^m\right].$$

Now $1 - 1/p^n < 1$ and $1 + 1/p^n \le 4/3$ as $p^n \ge 3$. Thus q is at
least

$$1 - \left(\tfrac{1}{2}\right)^m - \left(\tfrac{2}{3}\right)^m.$$

This is an increasing function of m and, for m = 3, its value is

$$1 - \tfrac{1}{8} - \tfrac{8}{27} = \tfrac{125}{216} > \tfrac{4}{9}.$$

If m = 2, then

$$q = 1 - \tfrac{1}{4}\left[\left(1 + \tfrac{1}{p^n}\right)^2 + \left(1 - \tfrac{1}{p^n}\right)^2\right] = \tfrac{1}{2} - \tfrac{1}{2p^{2n}} \ge \tfrac{1}{2} - \tfrac{1}{18} = \tfrac{4}{9}. \quad □$$

If in the situation of Theorem 5 we were to choose 50 elements g at
random, then the probability that no proper factor of f would be found
would be at most $(5/9)^{50}$, or about $1.7 \times 10^{-13}$. To compute
gcd(f, $g^e - 1$), we first raise g to the power e modulo f using the binary
power algorithm, subtract 1, and compute the greatest common divisor with f.

Let us apply Theorem 5 to an example. Let p = 2281 and
$f = 1815 + 1152X + 1804X^2 + X^3$ in $Z_p[X]$. It is a fact that f is the
product of three distinct monic polynomials of degree 1. To find them, we
initialize to work in $Z_p[X]/\langle f \rangle$.

      ⎕IO←0
      N←2281
      ZNXFINIT F←1815 1152 1804 1
      E←(N-1)÷2

We choose a random polynomial g of degree at most 2

      ⎕←G←?3⍴N
765 551 1313

and compute $g^e$ modulo f.

      ⎕←H←G ZNXFPOWER E
1755 1344 251

Next, we compute gcd(f, $g^e - 1$).

      F ZNXGCD H ZNXDIFF 1
734 487 1
      F ZNXQUOT F1←734 487 1
1317 1

Here we have obtained the partial factorization $f = (X + 1317)f_1$, where
$f_1 = 734 + 487X + X^2$. To complete the factorization, we repeat the
process with $f_1$.

      ZNXFINIT F1
      G←?2⍴N
      H←G ZNXFPOWER E
      F1 ZNXGCD H ZNXDIFF 1

Here we were unlucky and did not find a proper factor. Trying once more,

      G←?2⍴N
      H←G ZNXFPOWER E
      F1 ZNXGCD H ZNXDIFF 1
847 1
      F1 ZNXQUOT 847 1
1921 1

we obtain the factorization f = (847 + X)(1317 + X)(1921 + X).
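The whole random procedure of this example fits in a short Python sketch. It is an illustration under stated assumptions, not the CLASSLIB implementation: the function names are hypothetical, the input is assumed to be a product of distinct monic linear factors over $Z_p$ with p odd, and the random choices differ from those in the APL session, so intermediate factors may appear in a different order.

```python
import random

P = 2281  # the prime of the text's example


def trim(a):
    """Copy of a coefficient sequence (constant term first) without trailing zeros."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_divmod(a, b):
    """Quotient and remainder of a by a nonzero b in Z_P[X]."""
    a, b = trim(c % P for c in a), trim(c % P for c in b)
    inv = pow(b[-1], -1, P)
    quot = [0] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        factor = a[-1] * inv % P
        quot[shift] = factor
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - factor * c) % P
        a = trim(a)
    return quot, a


def poly_gcd(a, b):
    """Monic greatest common divisor in Z_P[X] (a must be nonzero)."""
    a, b = trim(c % P for c in a), trim(c % P for c in b)
    while b:
        a, b = b, poly_divmod(a, b)[1]
    inv = pow(a[-1], -1, P)
    return [c * inv % P for c in a]


def mul_mod(a, b, f):
    out = [0] * max(len(a) + len(b) - 1, 0)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return poly_divmod(out, f)[1]


def pow_mod(g, e, f):
    """g^e modulo f by the binary power algorithm."""
    result, base = [1], poly_divmod(g, f)[1]
    while e:
        if e & 1:
            result = mul_mod(result, base, f)
        base = mul_mod(base, base, f)
        e >>= 1
    return result


def proper_factor(f, rng):
    """Repeat the random trial of Theorem 5 (with n = 1) until it succeeds."""
    e = (P - 1) // 2
    while True:
        g = [rng.randrange(P) for _ in range(len(f) - 1)]
        h = pow_mod(g, e, f) or [0]
        h[0] = (h[0] - 1) % P                 # h is now g^e - 1 modulo f
        d = poly_gcd(f, h)
        if 1 < len(d) < len(f):               # 0 < deg(d) < deg(f)
            return d


def linear_factors(f, rng):
    if len(f) == 2:
        return [f]
    d = proper_factor(f, rng)
    return linear_factors(d, rng) + linear_factors(poly_divmod(f, d)[0], rng)


F = [1815, 1152, 1804, 1]
print(sorted(linear_factors(F, random.Random(0))))
```

Whatever random choices are made, the sorted result is the factorization found in the text, with factors 847 + X, 1317 + X, and 1921 + X.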

In order to use Theorem 5 with a polynomial f about whose factorization
we have no information, we must consider possible factors one degree
at a time, starting with degree 1. Suppose we know that f has no nonconstant
factors of degree less than m. Then any factor of degree m is irreducible
and so is also a factor of $X^{p^m} - X$. Thus $g = \gcd(f, X^{p^m} - X)$ is
the product of the distinct monic irreducible factors of f of degree m. We
can factor g using Theorem 5 and remove all irreducible factors of degree m
from f and then proceed to degree m + 1.

Theorem 5 is based on the factorization
$X^{p^n} - X = X(X^e - 1)(X^e + 1)$, where $e = (p^n - 1)/2$. This
factorization is valid only for odd primes p. If p = 2, then there is
another factorization of $X^{p^n} - X$ that we can use to get a result
similar to Theorem 5.


THEOREM 6. Let h be the polynomial

$$X + X^2 + X^4 + \cdots + X^{2^{n-1}}$$

in $Z_2[X]$. Then $X^{2^n} - X = h(h + 1)$.

Proof. We have

$$h(h + 1) = h + h^2 = X + X^2 + \cdots + X^{2^{n-1}} + X^2 + X^4 + \cdots + X^{2^n} = X + X^{2^n} = X^{2^n} - X$$

since squaring is a linear transformation and 1 = -1 in $Z_2$. □
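The identity of Theorem 6 can be verified mechanically for small n. In the Python sketch below, a polynomial over $Z_2$ is stored as an integer bit mask (bit k is the coefficient of $X^k$); this representation and the function names are choices made for the illustration.

```python
def clmul(a, b):
    """Carry-less product: multiplication in Z_2[X] on bit-mask polynomials."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def trace_poly(n):
    """h = X + X^2 + X^4 + ... + X^(2^(n-1)) as a bit mask."""
    return sum(1 << (2 ** i) for i in range(n))


def theorem_6_holds(n):
    h = trace_poly(n)
    x_power_minus_x = (1 << (2 ** n)) ^ 0b10    # X^(2^n) + X, since -1 = 1 in Z_2
    return clmul(h, h ^ 1) == x_power_minus_x   # h ^ 1 is h + 1


print([theorem_6_holds(n) for n in range(1, 9)])
```

Addition in $Z_2[X]$ is the exclusive-or of bit masks, which is why `h ^ 1` represents h + 1.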

Using Theorem 6, we can prove the following.

THEOREM 7. Let $f = f_1 \cdots f_m$ in $Z_2[X]$, where the $f_i$ are
distinct irreducible polynomials of the same degree n and m > 1. Let
$h = X + X^2 + \cdots + X^{2^{n-1}}$. Suppose a polynomial g of degree less
than mn is chosen randomly in $Z_2[X]$. Then the probability that
gcd(f, h(g)) is a proper factor of f is

$$1 - \frac{1}{2^{m-1}}.$$

Proof. See Exercise 4. □


The procedure ZNXFACTOR factors a polynomial f by removing factors
of degree m for m = 1, 2, .... The product g of the distinct monic
irreducible factors of f of degree m is computed. If deg(g) = rm with r ≥ 2,
then g is factored using Theorem 5 or Theorem 7 if $p^m$ is large and by
dividing g by all monic polynomials of degree m if $p^m$ is small.

      N←2281
      F
1815 1152 1804 1
      ZNXFACTOR F
847 1
1317 1
1921 1

Here we have recomputed the factorization of
$1815 + 1152X + 1804X^2 + X^3$ in $Z_p[X]$, p = 2281.


EXERCISES 


1 For each of the following primes p and polynomials f in $Z_p[X]$,
factor f into irreducible factors using the algorithm based on
Theorem 2.

(a) p=2,f=1+X4*4+X°>. 

(b) p=3,f=l+X+2X*24+2X5 4X5, 

(c) p=2,f=1+X+X24+X4* +X8. 

(d) p=5,f=1—X—-—X?+2X4*-X5 4X, 

2 The polynomial f= 2+ 11X + 8X? +4X4* +22X5 + X® in Z;, [X] 
is a product of two distinct monic irreducible polynomials of degree 
3. Factor f using the method of Theorem 5. 

3 The polynomialg=1+X+X5 +X®4+X74+ X88 4+X9 +X! + 
X}* in Z,[X] is a product of two distinct irreducible polynomials 
of degree 6. Factor g using the method of Theorem 7. 


4 Prove Theorem 7. 


5 State and prove versions of Theorems 5 and 7 that describe how to
find a nonzero noninvertible element of a commutative algebra
over $\mathbf{Z}_p$ that is known to be isomorphic to a direct sum of $m$
copies of $GF(p^n)$, where $m \ge 2$.


LINEAR 
TRANSFORMATIONS 


This chapter is devoted to the study of elements of the ring $\mathrm{End}_F(V)$,
where $F$ is a field and $V$ is a finite dimensional vector space over $F$. By
Corollary 5.3.7, $\mathrm{End}_F(V)$ is isomorphic to $M_n(F)$, the ring of n-by-n
matrices over $F$, where $n = \dim_F(V)$. We know that the group $GL_n(F)$ of
units of $M_n(F)$ is the set of n-by-n matrices with determinant different
from 0. One of our goals will be to describe the equivalence classes of
elements of $M_n(F)$ under an equivalence relation that extends the notion of
conjugacy in $GL_n(F)$ to all of $M_n(F)$.


1. SIMILARITY 


Throughout this section $V$ will be a vector space of dimension $n$ over a
field $F$. Let $A$ be the algebra $\mathrm{End}_F(V)$. If $v$ is in $V$ and $T$ is in $A$,
then $vT$ is, as usual, the image of $v$ under $T$. This map from $V \times A$ to
$V$ makes $V$ a right $A$-module. Fix an element $T$ of $A$. The evaluation map
taking $f$ in $F[X]$ to $f(T)$ is an $F$-algebra homomorphism of $F[X]$ into $A$.
This homomorphism allows us to consider $V$ to be an $F[X]$-module. If $v$ is
in $V$ and $f$ is in $F[X]$, then $vf$ will mean $vf(T)$. To emphasize the point
that different choices of $T$ lead to different $F[X]$-module structures on
$V$, we will write $V_T$ for $V$ considered as an $F[X]$-module. We would like to
be able to say when two linear transformations $S$ and $T$ in $A$ yield
isomorphic $F[X]$-modules $V_S$ and $V_T$. We will use the complete description
of finitely generated modules over the Euclidean domain $F[X]$ given in
Sections 5 and 6 of Chapter 6 to solve this problem.

In Section 3.10 we showed that conjugation by an element of a group 
G is an automorphism of G. The next theorem is a generalization of this 
result. 


THEOREM 1. Let $u$ be a unit of a ring $A$. The map $\sigma: A \to A$ such that
$x\sigma = u^{-1}xu$ is an automorphism of $A$. If $A$ is an algebra over a
commutative ring $R$, then $\sigma$ is an $R$-algebra automorphism of $A$ and, for
all $x$ in $A$ and all $f$ in $R[X]$, we have $f(x)\sigma = f(x\sigma)$.




Proof. For all $x$ and $y$ in $A$ we have
$$(xy)\sigma = u^{-1}xyu = u^{-1}xuu^{-1}yu = (x\sigma)(y\sigma)$$
and
$$(x + y)\sigma = u^{-1}(x + y)u = u^{-1}xu + u^{-1}yu = (x\sigma) + (y\sigma).$$

Since $1\sigma = u^{-1}u = 1$, we see that $\sigma$ is a ring homomorphism. If
$x\sigma = 0$, then $0 = u^{-1}xu$, and so $x = u0u^{-1} = 0$. Therefore $\sigma$ is
injective. Also, since $(uxu^{-1})\sigma = u^{-1}uxu^{-1}u = x$, we see that $\sigma$ is
surjective and therefore $\sigma$ is a ring automorphism. If $A$ is an
$R$-algebra for some commutative ring $R$, then for all $a$ in $R$ we have
$(ax)\sigma = u^{-1}(ax)u = a(u^{-1}xu) = a(x\sigma)$ and hence $\sigma$ is an
automorphism of $A$ as an $R$-algebra. Suppose $f = a_0 + a_1X + \cdots + a_nX^n$
is in $R[X]$. Then
$$f(x\sigma) = a_0 + a_1(x\sigma) + \cdots + a_n(x\sigma)^n = (a_0 + a_1x + \cdots + a_nx^n)\sigma = f(x)\sigma.$$ □


Theorem 1 gives us our first answer to the question of when $V_S$ and
$V_T$ are isomorphic $F[X]$-modules.


THEOREM 2. Let $S$ and $T$ be in $\mathrm{End}_F(V)$. Then $V_S$ and $V_T$ are
isomorphic as $F[X]$-modules if and only if there is a unit $P$ of
$\mathrm{End}_F(V)$ such that $T = P^{-1}SP$.


Proof. Suppose first that $V_S$ and $V_T$ are isomorphic and let $P: V_S \to V_T$
be an $F[X]$-isomorphism. Then $P$ is certainly an $F$-isomorphism, that is,
an isomorphism of vector spaces, and so $P$ is a unit in $\mathrm{End}_F(V)$. For all
$f$ in $F[X]$ and all $v$ in $V_S$ we have $vf(S)P = vPf(T)$. Take $f = X$. Then
$vSP = vPT$ for all $v$ in $V$. Therefore $SP = PT$ or $T = P^{-1}SP$.

Conversely, suppose $T = P^{-1}SP$ for some unit $P$ of $\mathrm{End}_F(V)$. Suppose
$f$ is in $F[X]$. It follows from Theorem 1 that $f(T) = f(P^{-1}SP) = P^{-1}f(S)P$
or $Pf(T) = f(S)P$. Then, for all $v$ in $V_S$, we have $vf(S)P = vPf(T)$, so $P$ is
an $F[X]$-isomorphism of $V_S$ onto $V_T$. □


If $S$ and $T$ are in $\mathrm{End}_F(V)$, we say $S$ and $T$ are similar if there is a
unit $P$ of $\mathrm{End}_F(V)$ such that $T = P^{-1}SP$. Theorem 2 says that $V_S$ and
$V_T$ are isomorphic if and only if $S$ and $T$ are similar. We define
similarity of matrices in an analogous manner. Matrices $A$ and $B$ in
$M_n(F)$ are similar if there is an invertible matrix $Q$ such that
$B = Q^{-1}AQ$. It is easily checked (see Exercise 1) that similarity is an
equivalence relation for both linear transformations and matrices. The
following theorems show the relation between these two notions of
similarity.


THEOREM 3. Let $T$ be in $\mathrm{End}_F(V)$ and let $v_1, \ldots, v_n$ and
$w_1, \ldots, w_n$ be bases of $V$. If $A$ and $B$ are the matrices for $T$ with
respect to $v_1, \ldots, v_n$ and $w_1, \ldots, w_n$, respectively, then $A$ and $B$
are similar.


Proof. Let $Q$ be the matrix such that
$$w_i = \sum_j Q_{ij} v_j.$$
Then $Q$ is in $GL_n(F)$ and, by Corollary 5.3.10,
$B = QAQ^{-1} = (Q^{-1})^{-1}AQ^{-1}$. Thus $A$ and $B$ are similar. □


THEOREM 4. Let $S$ and $T$ be elements of $\mathrm{End}_F(V)$. The following
are equivalent.

(a) $S$ and $T$ are similar.

(b) For every basis $v_1, \ldots, v_n$ of $V$ the matrix of $S$ with respect to
$v_1, \ldots, v_n$ is similar to the matrix for $T$ with respect to the same
basis.

(c) There exist bases $v_1, \ldots, v_n$ and $w_1, \ldots, w_n$ of $V$ such that the
matrix of $S$ with respect to $v_1, \ldots, v_n$ is the same as the matrix
of $T$ with respect to $w_1, \ldots, w_n$.


Proof. (a) ⇒ (b). Suppose $T = P^{-1}SP$ for some unit $P$ in $\mathrm{End}_F(V)$.
Let $A$, $B$, and $Q$ be the matrices of $S$, $T$, and $P$, respectively, with
respect to the basis $v_1, \ldots, v_n$. Then $B = Q^{-1}AQ$, and $A$ and $B$ are
similar.

(b) ⇒ (c). Let $w_1, \ldots, w_n$ be any basis of $V$ and let $A$ and $B$ be the
matrices of $S$ and $T$ with respect to $w_1, \ldots, w_n$. By (b), there is a
matrix $Q$ such that $B = QAQ^{-1}$. By Corollary 5.3.10, $QAQ^{-1}$ is the matrix
of $S$ with respect to the basis $v_1, \ldots, v_n$, where
$$v_i = \sum_j Q_{ij} w_j.$$

(c) ⇒ (a). Suppose $v_1, \ldots, v_n$ and $w_1, \ldots, w_n$ are bases of $V$ such
that the matrix $A$ of $S$ with respect to $v_1, \ldots, v_n$ is the same as the
matrix of $T$ with respect to $w_1, \ldots, w_n$. Thus
$$v_iS = \sum_j A_{ij} v_j, \qquad w_iT = \sum_j A_{ij} w_j.$$
There is a unique element $P$ of $\mathrm{End}_F(V)$ such that $v_iP = w_i$. Since $P$
is clearly surjective, $P$ is a unit of $\mathrm{End}_F(V)$ and $w_iP^{-1} = v_i$. Thus
$$w_iP^{-1}SP = v_iSP = \Bigl(\sum_j A_{ij} v_j\Bigr)P = \sum_j A_{ij} v_jP = \sum_j A_{ij} w_j = w_iT.$$
Thus $P^{-1}SP$ and $T$ agree on $w_1, \ldots, w_n$, and so they agree on all of $V$.
Therefore $T = P^{-1}SP$. □


Let us consider an example in which $F$ is $\mathbf{Z}_7$ and $V$ is the set of
polynomials in $\mathbf{Z}_7[X]$ of degree at most 3. Let $D: V \to V$ be the
restriction of the derivative map to $V$. Thus
$$(a_0 + a_1X + a_2X^2 + a_3X^3)D = a_1 + 2a_2X + 3a_3X^2.$$
Since
$$1D = 0, \quad XD = 1, \quad X^2D = 2X, \quad X^3D = 3X^2,$$
the matrix of $D$ with respect to the basis $1, X, X^2, X^3$ of $V$ is


      ⎕←A←4 4⍴0 0 0 0 1 0 0 0 0 2 0 0 0 0 3 0
0 0 0 0
1 0 0 0
0 2 0 0
0 0 3 0


Now define $T: V \to V$ by
$$(a_0 + a_1X + a_2X^2 + a_3X^3)T = a_3X + 3a_0X^2 + 2a_2X^3.$$
Since
$$1T = 3X^2, \quad XT = 0, \quad X^2T = 2X^3, \quad X^3T = X,$$
the matrix for $T$ with respect to $1, X, X^2, X^3$ is

      ⎕←B←4 4⍴0 0 3 0 0 0 0 0 0 0 0 2 0 1 0 0
0 0 3 0
0 0 0 0
0 0 0 2
0 1 0 0


If Q is the matrix given by

      ⎕IO←0
      ⎕←Q←(⍳4)∘.=1 3 2 0
0 0 0 1
1 0 0 0
0 0 1 0
0 1 0 0

then Q is invertible with inverse QI obtained as follows.

      N←7
      ⎕←QI←ZNMATINV Q
0 1 0 0
0 0 0 1
0 0 1 0
1 0 0 0

The calculation

      Q ZNMATPROD A ZNMATPROD QI
0 0 3 0
0 0 0 0
0 0 0 2
0 1 0 0

shows that, modulo 7, B is Q+.×A+.×QI, so $A$ and $B$ are similar matrices
and hence $D$ and $T$ are similar linear transformations. The matrix Q is the
transition matrix from the basis $1, X, X^2, X^3$ of $V$ to the basis
$X^3, 1, X^2, X$. Therefore $B$ is the matrix for $D$ with respect to this
nonstandard basis.
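
The same check can be sketched outside APL. The Python below is an illustrative stand-in for ZNMATPROD, not the CLASSLIB code: `mat_mul_mod` is a hypothetical helper, and the inverse of Q is taken to be its transpose, which is valid only because Q is a permutation matrix.

```python
def mat_mul_mod(a, b, p):
    """Matrix product of a and b with entries reduced modulo p."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) % p
             for j in range(len(b[0]))]
            for i in range(len(a))]


P = 7
# Matrices of D and T with respect to the basis 1, X, X^2, X^3 (rows are images).
A = [[0, 0, 0, 0], [1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0]]
B = [[0, 0, 3, 0], [0, 0, 0, 0], [0, 0, 0, 2], [0, 1, 0, 0]]
# Rows of Q are the coordinate vectors of X^3, 1, X^2, X.
Q = [[0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0]]
# Assumption used here: a permutation matrix is inverted by transposing it.
Q_INV = [list(col) for col in zip(*Q)]

IDENTITY = [[int(i == j) for j in range(4)] for i in range(4)]
conjugate = mat_mul_mod(mat_mul_mod(Q, A, P), Q_INV, P)

if __name__ == "__main__":
    print(mat_mul_mod(Q, Q_INV, P) == IDENTITY, conjugate == B)
```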

If $T$ is in $\mathrm{End}_F(V)$, then the set $J$ of polynomials $g$ in $F[X]$ such that
$g(T) = 0$ is an ideal of $F[X]$. Now $\mathrm{End}_F(V)$ is isomorphic to $M_n(F)$,
which has dimension $n^2$ over $F$. Thus $1, T, T^2, \ldots, T^{n^2}$ cannot be
linearly independent over $F$, so there exist $a_0, \ldots, a_{n^2}$ in $F$, not all
0, such that $a_0 + a_1T + \cdots + a_{n^2}T^{n^2} = 0$. Thus $J \ne \{0\}$, and so $J$
consists of all multiples of a unique monic polynomial $f$. We call $f$ the
minimal polynomial of $T$. By the preceding argument, $\deg(f) \le n^2$. In
fact, as we will show in the next section, $\deg(f) \le n$. We can also define
the minimal polynomial of a matrix $A$ in $M_n(F)$ to be the monic polynomial
$f$ of smallest degree such that $f(A) = 0$.

In Section 7.1 we defined minimal polynomials of elements in extension
fields. In that situation minimal polynomials are always irreducible.
However, the minimal polynomials of linear transformations and matrices
need not be irreducible. For example, the matrix

      ⎕←A←2 2⍴1 0 0 2
1 0
0 2

in $M_2(\mathbf{Q})$ has minimal polynomial $f = X^2 - 3X + 2 = (X - 1)(X - 2)$.
We can verify that $f(A) = 0$ as follows.

      ⎕←I←(⍳2)∘.=⍳2
1 0
0 1
      (A+.×A)+(¯3×A)+2×I
0 0
0 0

Therefore the minimal polynomial of $A$ divides $f$. Since $A$ is not a
rational multiple of $I$, the minimal polynomial of $A$ cannot have degree 1,
and so the minimal polynomial is $f$, which is certainly not irreducible.
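
As a cross-check of this example, here is a short Python sketch (an illustration only; `mat_mul` and `poly_at_matrix` are hypothetical helper names) that evaluates a polynomial at a square matrix by Horner's rule and confirms that $X^2 - 3X + 2$ vanishes at $A$ while no monic polynomial of degree 1 can.

```python
def mat_mul(a, b):
    """Plain integer matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def poly_at_matrix(coeffs, a):
    """Evaluate c_0 + c_1 X + ... at the square matrix a (Horner's rule).

    coeffs lists the coefficients from the constant term upward.
    """
    n = len(a)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, a)
        for i in range(n):
            result[i][i] += c
    return result


A = [[1, 0], [0, 2]]
F_COEFFS = [2, -3, 1]                     # f = 2 - 3X + X^2
f_of_a = poly_at_matrix(F_COEFFS, A)

# A monic degree-1 polynomial X - c vanishes at A only if A = cI.
is_scalar = A[0][1] == 0 and A[1][0] == 0 and A[0][0] == A[1][1]

if __name__ == "__main__":
    print(f_of_a, is_scalar)
```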

THEOREM 5. Let $T$ be in $\mathrm{End}_F(V)$ and let $A$ be the matrix of $T$
with respect to a basis $v_1, \ldots, v_n$ of $V$. Then $T$ and $A$ have the same
minimal polynomial.

Proof. The map that takes $T$ in $\mathrm{End}_F(V)$ to its matrix $A$ in $M_n(F)$
with respect to $v_1, \ldots, v_n$ is an $F$-algebra isomorphism. Thus if
$a_0, \ldots, a_m$ are in $F$, then $a_0 + a_1T + \cdots + a_mT^m$ is 0 in $\mathrm{End}_F(V)$ if
and only if $a_0 + a_1A + \cdots + a_mA^m$ is 0 in $M_n(F)$. Therefore $T$ and $A$
satisfy the same polynomials and so have the same minimal polynomial. □

We close this section with a simple but useful observation. 

THEOREM 6. If $A$ and $B$ in $M_n(F)$ are similar, then $\det A = \det B$.

Proof. If $A$ and $B$ are similar, then $B = Q^{-1}AQ$ for some $Q$ in $GL_n(F)$.
Thus
$$\det B = (\det Q^{-1})(\det A)(\det Q) = \det A,$$
since $\det Q^{-1} = (\det Q)^{-1}$. □


EXERCISES 
1 Let $R$ be a ring and let $x$ and $y$ be in $R$. Define $x$ to be similar to
$y$ if there is a unit $u$ in $R$ such that $y = u^{-1}xu$. Show that
similarity is an equivalence relation on $R$.


2 Suppose $S$ and $T$ are similar linear transformations of the vector
space $V$ into itself. Show that the kernels of $S$ and $T$ have the
same dimension and that the images of $S$ and $T$ have the same
dimension.


3 Generalize Exercise 2 by showing that for all polynomials $f$ in
$F[X]$ the kernels of $f(S)$ and $f(T)$ have the same dimensions and
the images of $f(S)$ and $f(T)$ have the same dimensions.


4 Let $A$, $B$, and $Q$ be in $M_n(F)$ with $A$ and $B$ fixed and $Q$ variable.
Show that the matrix equation $QB = AQ$ is a system of linear
equations on the entries of $Q$. Show that $A$ and $B$ are similar if and
only if there is a $Q$ such that $QB = AQ$ and $\det Q \ne 0$. (The
condition $\det Q \ne 0$ is not a linear condition on the entries of $Q$.)


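5 In $M_2(\mathbf{Z}_3)$, let
$$A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$
Show that $A$ and $B$ are similar by solving the linear system $QB = AQ$
and exhibiting an invertible solution $Q$.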


In M, (Q), let 


let V = Q © Q, and let S and T be the elements End g(V) whose 
matrices with respect to the standard basis of V are A and B, re- 
spectively. Show that the kernel and image of both S and T have 
dimension 1. Find vectors v, and w, spanning the kernels of S and 
T, respectively, and find vectors v, and w, spanning the images of S 
and T, respectively. Show that v,, v, and w,,w, are each bases of 
V. Compute the matrix of S with respect to v,, v, and the matrix 
of T with respect to w,,w,. Are A and B similar? 


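7 Show that the matrices
$$A = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 1 & 0 \\ 4 & 1 \end{pmatrix}$$
are similar in $M_2(\mathbf{Q})$ but not in $M_2(\mathbf{Z})$. That is, there does not
exist a unit $Q$ of $M_2(\mathbf{Z})$ such that $B = Q^{-1}AQ$, but there is a unit of
$M_2(\mathbf{Q})$ with this property.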




8 Prove that similar matrices have the same minimal polynomials. 
9 InM,(Z, ) let 
Li. 
A= , 
] 2 


Let Co, C1, Cz be in Z,. Show that the matrix equation cg + c,A + 
c,A?* = 0 corresponds to a system of homogeneous linear equations 
in the c;. Describe all solutions of this system and compute the 
minimal polynomial of A. 


2. RATIONAL CANONICAL FORM 


In the previous section we defined similarity of linear transformations and
of matrices and we showed how to reduce the problem of deciding when
two linear transformations are similar to the problem of deciding when
two matrices are similar. In the exercises several approaches to deciding
similarity of matrices were introduced. These approaches are useful in
special cases but are not adequate in general. In this section we present an
algorithm for deciding when two n-by-n matrices $A$ and $B$ over a field $F$ are
similar and, when they are, for finding an element $Q$ of $GL_n(F)$ such that
$B = Q^{-1}AQ$.

Let $V$ be an n-dimensional vector space over $F$ and let $T$ be in
$\mathrm{End}_F(V)$. The module $V_T$ is a finitely generated $F[X]$-module; therefore,
by the results of Sections 6.5 and 6.6, there are unique monic polynomials
$f_1, \ldots, f_r$ of positive degree in $F[X]$ with $f_i$ dividing $f_{i+1}$,
$1 \le i < r$, and a nonnegative integer $m$ such that $V_T$ is isomorphic to
$$(F[X]/\langle f_1 \rangle) \oplus \cdots \oplus (F[X]/\langle f_r \rangle) \oplus F[X]^m.$$
Since $V$ is finite dimensional over $F$ and $F[X]$ is not finite dimensional,
$m$ must be 0. Therefore $V_T$ is isomorphic to
$$(F[X]/\langle f_1 \rangle) \oplus \cdots \oplus (F[X]/\langle f_r \rangle).$$
Since $f_1, \ldots, f_r$ determine $V_T$ up to module isomorphism, by Theorem
1.2 the $f_i$ determine $T$ up to similarity. Thus the problem is how to
determine the $f_i$.


The standard way of describing $T$ is to choose a basis $v_1, \ldots, v_n$ of
$V$ and give the matrix $A$ of $T$ with respect to $v_1, \ldots, v_n$. Now
$v_1, \ldots, v_n$ span $V$ over $F$, and so $v_1, \ldots, v_n$ certainly generate $V_T$
as an $F[X]$-module. Thus the map of $F[X]^n$ to $V_T$ taking $(g_1, \ldots, g_n)$ to
$v_1g_1 + \cdots + v_ng_n$ is a surjective $F[X]$-homomorphism. Let $M$ be the
kernel of this map. Then $V_T$ is isomorphic to $F[X]^n/M$. Since $M$ is a
submodule of $F[X]^n$, we know $M$ can be generated by $n$ elements. If we
knew a matrix $C$ over $F[X]$ such that the rows of $C$ generated $M$, we could
determine the $f_i$ by reducing $C$ over $F[X]$ using row and column
operations as described in Section 6.5. The next theorem tells us one such
matrix.

THEOREM 1. Let $T$ be in $\mathrm{End}_F(V)$, let $A$ be the matrix of $T$ with
respect to the basis $v_1, \ldots, v_n$ of $V$, and let $I$ be the n-by-n identity
matrix. Then $V_T$ is isomorphic to $F[X]^n/M$, where $M$ is the submodule of
$F[X]^n$ generated by the rows of $A - XI$.

Proof. As written, the matrix $A - XI$ is not in $M_n(F[X])$ but, indeed, in
$M_n(F)[X]$. However, we remarked in Section 4.5 that these rings are
isomorphic, and we agreed to identify them.

Let $M$ be the kernel of the map $\tau$ of $F[X]^n$ onto $V_T$ mapping
$(g_1, \ldots, g_n)$ to $v_1g_1 + \cdots + v_ng_n$. We will show that $M$ is generated by
the rows of $A - XI$.

The $i$th row of $A - XI$ is the vector
$$(A_{i1}, \ldots, A_{i,i-1}, A_{ii} - X, A_{i,i+1}, \ldots, A_{in}).$$
The image under $\tau$ of this vector is
$$v_1A_{i1} + \cdots + v_{i-1}A_{i,i-1} + v_i(A_{ii} - T) + v_{i+1}A_{i,i+1} + \cdots + v_nA_{in},$$
which is
$$\sum_j A_{ij} v_j - v_iT.$$
However, by the definition of $A$,
$$v_iT = \sum_j A_{ij} v_j,$$
so $\tau$ maps each row of $A - XI$ to 0. Therefore the rows of $A - XI$ are in
$M$. Let $N$ be the submodule of $F[X]^n$ generated by the rows of $A - XI$.
Then $N \subseteq M$. To show that $N = M$, it is sufficient to show that the
dimension of $F[X]^n/N$ is the same as the dimension of $F[X]^n/M$, that is, $n$.


(Why?) Let $D$ be the matrix that is reduced over $F[X]$ and equivalent to
$A - XI$. Then
$$D = \begin{pmatrix} g_1 & & \\ & \ddots & \\ & & g_n \end{pmatrix},$$
where the elementary divisors $g_i$ are monic polynomials and $g_i$ divides
$g_{i+1}$, $1 \le i < n$. There exist units $P$ and $Q$ of $M_n(F[X])$ such that
$$D = P(A - XI)Q.$$
Therefore
$$\det D = (\det P)[\det(A - XI)](\det Q).$$
But $\det P$ and $\det Q$ are units in $F[X]$, that is, they are nonzero elements
of $F$. Therefore $\det D$ and $\det(A - XI)$ have the same degree. Now $A - XI$
looks like
$$\begin{pmatrix} A_{11} - X & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} - X & \cdots & A_{2n} \\ \vdots & & \ddots & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} - X \end{pmatrix}.$$
Of the $n!$ terms in $\det(A - XI)$, every one has degree less than $n$ except
$$(A_{11} - X)(A_{22} - X) \cdots (A_{nn} - X),$$
which has degree $n$ and leading coefficient $(-1)^n$. Therefore $\det(A - XI)$
has degree $n$ and hence so does $\det D$. By Corollary 6.5.3, the module
$F[X]^n/N$ is isomorphic to $F[X]^n/L$, where $L$ is the submodule of $F[X]^n$
generated by the rows of $D$, and $F[X]^n/L$ is isomorphic to
$$(F[X]/\langle g_1 \rangle) \oplus \cdots \oplus (F[X]/\langle g_n \rangle).$$
The dimension of $F[X]/\langle g_i \rangle$ is $\deg(g_i)$, so $F[X]^n/N$ has dimension
$$\sum_{i=1}^{n} \deg(g_i) = \deg(g_1 \cdots g_n) = \deg(\det D) = n.$$ □


COROLLARY 2. If $A$ and $B$ are in $M_n(F)$, then $A$ and $B$ are similar if
and only if $A - XI$ and $B - XI$ are equivalent over $F[X]$.


Proof. Let $V = F^n$ and let $S$ and $T$ be the elements of $\mathrm{End}_F(V)$ defined
by $vS = vA$ and $vT = vB$. Then $A$ and $B$ are the matrices of $S$ and $T$,
respectively, with respect to the standard basis of $V$. Now $A$ and $B$ are
similar if and only if $S$ and $T$ are similar, which is true if and only if
$V_S$ and $V_T$ are isomorphic. By Theorem 1, $V_S$ is isomorphic to $F[X]^n/M$ and
$V_T$ is isomorphic to $F[X]^n/N$, where $M$ and $N$ are submodules of $F[X]^n$
generated by the rows of $A - XI$ and $B - XI$, respectively. By the results
of Sections 6.5 and 6.6, $F[X]^n/M$ and $F[X]^n/N$ are isomorphic if and only
if $A - XI$ and $B - XI$ are equivalent. □


RATIONAL CANONICAL FORM 383 


In the proof of Theorem 1 the determinant of $A - XI$ played an important
role. This element of $F[X]$ is called the characteristic polynomial
of $A$. We will encounter characteristic polynomials several times in this
section and again in Section 3. The characteristic polynomial of
$$A = \begin{pmatrix} 1 & 3 & -1 \\ 2 & 0 & 1 \\ -3 & 1 & 2 \end{pmatrix}$$
is
$$\det\begin{pmatrix} 1 - X & 3 & -1 \\ 2 & -X & 1 \\ -3 & 1 & 2 - X \end{pmatrix} = -24 + 8X + 3X^2 - X^3.$$
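
To reproduce this computation mechanically, the following Python sketch computes the coefficients of $\det(A - XI)$. It uses the Faddeev-LeVerrier recurrence, which is a technique swapped in here for illustration and is not the method of the text (the text reduces $A - XI$ over $F[X]$); the function names are hypothetical, and exact rational arithmetic is assumed so that the divisions by $k$ are safe.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Matrix product over any ring supporting + and *."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def char_poly(matrix):
    """Coefficients of det(A - X I), constant term first.

    Faddeev-LeVerrier: M_1 = I, c_{n-k} = -tr(A M_k)/k,
    M_{k+1} = A M_k + c_{n-k} I gives det(X I - A) = sum c_k X^k;
    multiplying by (-1)^n converts to det(A - X I).
    """
    n = len(matrix)
    a = [[Fraction(x) for x in row] for row in matrix]
    m = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs = [Fraction(0)] * n + [Fraction(1)]
    for k in range(1, n + 1):
        am = mat_mul(a, m)
        c = -sum(am[i][i] for i in range(n)) / k
        coeffs[n - k] = c
        m = [[am[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    sign = (-1) ** n
    return [sign * c for c in coeffs]


A = [[1, 3, -1], [2, 0, 1], [-3, 1, 2]]

if __name__ == "__main__":
    print([int(c) for c in char_poly(A)])
```

The sign convention matters: the text defines the characteristic polynomial as $\det(A - XI)$, so for odd $n$ the leading coefficient is $-1$.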


In CLASSLIB there are procedures for reduction of matrices over
$F[X]$, where $F$ is $\mathbf{Z}_p$, $p$ a prime. Let

      N←7
      ⎕←A←3 3⍴5 4 4 6 3 4 1 3 2
5 4 4
6 3 4
1 3 2
      ⎕←B←3 3⍴5 3 0 4 3 5 0 3 1
5 3 0
4 3 5
0 3 1
      ⎕←C←3 3⍴1 4 2 2 3 2 0 0 6
1 4 2
2 3 2
0 0 6

Let us use Corollary 2 and the procedure ZNXREDUCE to test the similarity
of $A$, $B$, and $C$ considered as elements of $GL_3(\mathbf{Z}_7)$. The matrix $A - XI$
is represented by A1, defined as follows.

      ⎕←I←(⍳3)∘.=⍳3
1 0 0
0 1 0
0 0 1
      DAZV A1←(3 3 1⍴,A),-I
5 ¯1    4  0    4  0
6  0    3 ¯1    4  0
1  0    3  0    2 ¯1

[The array 3 3 1⍴,A represents $A$ considered as an element of
$M_3(\mathbf{Z}_7[X])$.]
Reducing A1,

      DAZV ZNXREDUCE A1
1 0 0    0 0 0    0 0 0
0 0 0    1 1 0    0 0 0
0 0 0    0 0 0    2 3 1

we see that $A - XI$ is equivalent to
$$D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 + X & 0 \\ 0 & 0 & 2 + 3X + X^2 \end{pmatrix}.$$
Similarly, from

      DAZV ZNXREDUCE (3 3 1⍴,B),-I
1 0 0 0    0 0 0 0    0 0 0 0
0 0 0 0    1 0 0 0    0 0 0 0
0 0 0 0    0 0 0 0    2 3 5 1
      DAZV ZNXREDUCE (3 3 1⍴,C),-I
1 0 0    0 0 0    0 0 0
0 0 0    1 1 0    0 0 0
0 0 0    0 0 0    2 3 1

we find that $B - XI$ is equivalent to
$$E = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 + 3X + 5X^2 + X^3 \end{pmatrix},$$
while $C - XI$ is equivalent to $D$. Thus $A$ and $C$ are similar in $M_3(\mathbf{Z}_7)$
and $B$ is not similar to either $A$ or $C$.

It is convenient to be able to choose one representative from each
similarity class of matrices in $M_n(F)$. Such sets of representatives are
called canonical forms. We will now describe one canonical form.

If $R$ is a commutative ring and $J$ is any ideal of $R$, then $M = R/J$ is an
$R$-module and multiplication of elements of $M$ by a fixed element $a$ of $R$
is an $R$-endomorphism. Let us consider the case in which $R = F[X]$ and
$J = \langle f \rangle$, where $f$ is a monic polynomial of positive degree $n$.
Multiplication by $X$ is an $F[X]$-endomorphism $T$ of $M = F[X]/J$ and so, in
particular, $T$ is in $\mathrm{End}_F(M)$. Let $v_i = X^i + J$, $0 \le i < n$. Then
$v_0, \ldots, v_{n-1}$ is a basis of $M$. What is the matrix of $T$ with respect to
$v_0, \ldots, v_{n-1}$? If $0 \le i < n - 1$, then $v_iT = (X^i + J)X = X^{i+1} + J = v_{i+1}$.
Suppose $f = a_0 + a_1X + \cdots + a_{n-1}X^{n-1} + X^n$; then
$$v_{n-1}T = (X^{n-1} + J)X = X^n + J = -a_0 - a_1X - \cdots - a_{n-1}X^{n-1} + J = -a_0v_0 - a_1v_1 - \cdots - a_{n-1}v_{n-1}.$$
Thus the matrix of $T$ is
$$C = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & -a_2 & \cdots & -a_{n-1} \end{pmatrix}.$$
We call $C$ the companion matrix of $f$. For example, if
$f = 2 - 3X + X^2 - 4X^3 + X^4$ in $\mathbf{R}[X]$, then the companion matrix of $f$ is
$$\begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -2 & 3 & -1 & 4 \end{pmatrix}.$$
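
The construction of a companion matrix is mechanical, as the following Python sketch suggests. The function name `companion` and its optional `modulus` argument are hypothetical conveniences introduced for this illustration; the row-vector convention of the text is kept, so the coefficients of $f$ appear, negated, in the last row.

```python
def companion(coeffs, modulus=None):
    """Companion matrix of the monic polynomial
    f = a_0 + a_1 X + ... + a_{n-1} X^{n-1} + X^n.

    coeffs is [a_0, ..., a_{n-1}]; the leading 1 is implied.
    If modulus is given, entries are reduced modulo it.
    """
    n = len(coeffs)
    # Rows 0..n-2: v_i T = v_{i+1}, a single 1 just right of the diagonal.
    rows = [[int(j == i + 1) for j in range(n)] for i in range(n - 1)]
    # Last row: v_{n-1} T = -a_0 v_0 - ... - a_{n-1} v_{n-1}.
    rows.append([-a for a in coeffs])
    if modulus is not None:
        rows = [[x % modulus for x in row] for row in rows]
    return rows


if __name__ == "__main__":
    for row in companion([2, -3, 1, -4]):
        print(row)
```

For example, `companion([15], 17)` yields the 1-by-1 matrix with entry 2, the companion matrix of $15 + X$ over $\mathbf{Z}_{17}$.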


THEOREM 3. Let $f$ be a monic polynomial of positive degree $n$ in
$F[X]$ and let $C$ be the companion matrix of $f$. Then the minimal
polynomial of $C$ is $f$ and the characteristic polynomial of $C$ is $(-1)^nf$.


Proof. As noted, $C$ is the matrix for the linear transformation $T$ of
$V = F[X]/\langle f \rangle$ defined by multiplication by $X$ with respect to the basis
$1 + J, X + J, \ldots, X^{n-1} + J$, where $J = \langle f \rangle$. By Theorem 8.1.5, $C$ and $T$
have the same minimal polynomial $g$. Now $g(T)$ is the linear transformation
of $V$ defined by multiplication by $g$. Since $g(T) = 0$, we must have
$$0 = (1 + J)g = g + J,$$
so $g$ is in $J$. That is, $f$ divides $g$. But $(X^i + J)f = X^if + J = J$, so
$f(T) = 0$. Therefore $g$ divides $f$. Since $f$ and $g$ are monic, they must be
equal.

Now let $D$ be the reduced matrix over $F[X]$ equivalent to $C - XI$.
Let $M$ be the submodule of $F[X]^n$ generated by the rows of $D$. Then
$F[X]^n/M$ is isomorphic to $V = F[X]/\langle f \rangle$, so
$$D = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & f \end{pmatrix}.$$
As noted in the proof of Theorem 1, the determinant of $C - XI$ is a multiple
of $\det D$ by an element of $F$. Since $\det D = f$ is monic and $\det(C - XI)$ has
leading coefficient $(-1)^n$, we have $\det(C - XI) = (-1)^nf$. □


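The matrix
$$C = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 6 & 3 & 1 & 2 \end{pmatrix}$$
in $M_4(\mathbf{Z}_{11})$ is the companion matrix of $f = 5 + 8X + 10X^2 + 9X^3 + X^4$ in
$\mathbf{Z}_{11}[X]$. By Theorem 3, the minimal and characteristic polynomials of $C$
are both $f$.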

Let $V$ and $W$ be vector spaces over $F$ and suppose $S$ is in $\mathrm{End}_F(V)$
and $T$ is in $\mathrm{End}_F(W)$. Let $S \oplus T$ denote the map of $V \oplus W$ into itself
defined by $(v, w)(S \oplus T) = (vS, wT)$. Then for $v, v'$ in $V$ and $w, w'$ in $W$,
we have
$$\begin{aligned}
[(v, w) + (v', w')](S \oplus T) &= (v + v', w + w')(S \oplus T) \\
&= ((v + v')S, (w + w')T) \\
&= (vS + v'S, wT + w'T) \\
&= (vS, wT) + (v'S, w'T) \\
&= (v, w)(S \oplus T) + (v', w')(S \oplus T).
\end{aligned}$$
Similarly, $[a(v, w)](S \oplus T) = a[(v, w)(S \oplus T)]$ for all $a$ in $F$. Thus
$S \oplus T$ is in $\mathrm{End}_F(V \oplus W)$.

Let $v_1, \ldots, v_m$ be a basis of $V$ and let $w_1, \ldots, w_n$ be a basis of $W$.
Then it is easy to show (see Exercise 13) that $(v_1, 0), \ldots, (v_m, 0)$,
$(0, w_1), \ldots, (0, w_n)$ is a basis of $V \oplus W$. Let $A$ be the matrix of $S$ with
respect to $v_1, \ldots, v_m$ and let $B$ be the matrix of $T$ with respect to
$w_1, \ldots, w_n$. Then
$$(v_i, 0)(S \oplus T) = (v_iS, 0) = \Bigl(\sum_j A_{ij} v_j, 0\Bigr) = \sum_j A_{ij}(v_j, 0).$$
Similarly,
$$(0, w_i)(S \oplus T) = \sum_j B_{ij}(0, w_j),$$
so the matrix of $S \oplus T$ with respect to the basis $(v_1, 0), \ldots, (v_m, 0)$,
$(0, w_1), \ldots, (0, w_n)$ of $V \oplus W$ is the block matrix
$$C = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}.$$
We often write $C = A \oplus B$.


THEOREM 4. Suppose $A$ is in $M_m(F)$ and $B$ is in $M_n(F)$. Let
$C = A \oplus B$. Then the minimal polynomial of $C$ is the monic least common
multiple of the minimal polynomials of $A$ and $B$. The characteristic
polynomial of $C$ is the product of the characteristic polynomials of $A$
and $B$.


Proof. Let $f$ be in $F[X]$. Since
$$C = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix},$$
it is easy to see that $f(C)$ is the block matrix
$$\begin{pmatrix} f(A) & 0 \\ 0 & f(B) \end{pmatrix}.$$
Thus $f(C) = 0$ if and only if both $f(A)$ and $f(B)$ are 0. This is equivalent
to saying $f(C) = 0$ if and only if $f$ is divisible by the minimal polynomial
of $A$ and by the minimal polynomial of $B$. Thus the minimal polynomial of
$C$ is the monic least common multiple of the minimal polynomials of $A$
and $B$. The characteristic polynomial of $C$ is $\det(C - XI)$. Now
$$C - XI = \begin{pmatrix} A - XI_m & 0 \\ 0 & B - XI_n \end{pmatrix},$$
where $I_m$ and $I_n$ are the m-by-m and n-by-n identity matrices, respectively.
By Exercise 4.7.18, $\det(C - XI) = [\det(A - XI_m)][\det(B - XI_n)]$. □
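
The block structure used in this proof can be illustrated with a small Python sketch. The helper names `direct_sum` and `poly_at_matrix` and the sample matrices are assumptions made for the example; they are not taken from the text.

```python
def mat_mul(a, b):
    """Plain integer matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def poly_at_matrix(coeffs, a):
    """Evaluate c_0 + c_1 X + ... at the square matrix a (Horner's rule)."""
    n = len(a)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, a)
        for i in range(n):
            result[i][i] += c
    return result


def direct_sum(a, b):
    """Block matrix with a in the upper left and b in the lower right."""
    m, n = len(a), len(b)
    return [row + [0] * n for row in a] + [[0] * m + row for row in b]


# Sample data: A is the companion matrix of X^2 - 3X + 2, and B = (1).
A = [[0, 1], [-2, 3]]
B = [[1]]
C = direct_sum(A, B)

F_COEFFS = [2, -3, 1]   # lcm of X^2 - 3X + 2 and X - 1
G_COEFFS = [1, 1]       # an arbitrary test polynomial, X + 1

if __name__ == "__main__":
    print(poly_at_matrix(F_COEFFS, C))
    print(poly_at_matrix(G_COEFFS, C))
```

Here $f(C) = 0$ because $f$ is divisible by both minimal polynomials, while $g(C)$ is the direct sum of $g(A)$ and $g(B)$.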


Now we are in a position to be able to describe the canonical forms
mentioned previously. Let $A$ be in $M_n(F)$, let $V = F^n$, and let $T: V \to V$ be
defined by $vT = vA$. Then $T$ is in $\mathrm{End}_F(V)$ and $A$ is the matrix of $T$ with
respect to the standard basis of $V$. Let $A - XI$ be equivalent over $F[X]$ to
the reduced matrix
$$\begin{pmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 1 & & & \\ & & & f_1 & & \\ & & & & \ddots & \\ & & & & & f_r \end{pmatrix},$$
where $f_1, \ldots, f_r$ are monic polynomials of positive degree and $f_i$ divides
$f_{i+1}$, $1 \le i < r$. Then the module $V_T$ is isomorphic to
$$M = (F[X]/\langle f_1 \rangle) \oplus \cdots \oplus (F[X]/\langle f_r \rangle)$$
and the action of $T$ on $V_T$ corresponds to multiplication by $X$ on $M$. Now
we can choose a basis of $F[X]/\langle f_i \rangle$ such that the matrix for
multiplication by $X$ is the companion matrix $C_i$ of $f_i$. Putting these bases
together, we get a basis of $M$ with respect to which the matrix for
multiplication by $X$ is
$$C = C_1 \oplus \cdots \oplus C_r = \begin{pmatrix} C_1 & & \\ & \ddots & \\ & & C_r \end{pmatrix}.$$
Since $V_T$ is isomorphic to $M$, we can choose a basis of $V$ relative to which
the matrix for $T$ is $C$. Thus $A$ is similar to $C$. We call $C$ the rational
canonical form for $A$. By Theorem 4, the minimal polynomial of $A$ is $f_r$,
since $f_i$ divides $f_r$ for $1 \le i \le r$, and the characteristic polynomial of $A$
is
$$(-1)^n f_1 \cdots f_r.$$
As an obvious consequence of this result, we obtain the following theorem.


THEOREM 5 (Cayley-Hamilton Theorem). For any matrix $A$ in $M_n(F)$,
the minimal polynomial of $A$ divides the characteristic polynomial of $A$.


Proof. In the preceding notation, this is simply the observation that
$f_r$ divides $(-1)^n f_1 \cdots f_r$. □
Theorem 5 is often stated in the following form. Every matrix in
$M_n(F)$ satisfies its characteristic polynomial. Let us verify this in a
special case with $n = 3$ and $F = \mathbf{Z}_5$. Let

      N←5
      ⎕←B←3 3⍴4 2 1 4 1 4 2 4 3
4 2 1
4 1 4
2 4 3
      ⎕←I←(⍳3)∘.=⍳3
1 0 0
0 1 0
0 0 1

The characteristic polynomial of $B$, considered as an element of $M_3(\mathbf{Z}_5)$, is

      ZNXDET (3 3 1⍴,B),-I
4 2 3 4

$f = 4 + 2X + 3X^2 + 4X^3 = -1 + 2X - 2X^2 - X^3$. To show that $f(B)$ is the zero
matrix, we compute the powers of $B$

      ⎕←B2←B ZNMATPROD B
1 4 0
3 0 0
0 0 2
      ⎕←B3←B2 ZNMATPROD B
0 1 2
2 1 3
4 3 1

and compute $f(B)$.

      5|(4×I)+(2×B)+(3×B2)+4×B3
0 0 0
0 0 0
0 0 0
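
The same verification can be sketched in Python. This is an illustration rather than a transcription of the CLASSLIB procedures: `mat_mul_mod` and `poly_at_matrix_mod` are hypothetical helpers, and the matrix $B$ and the coefficient list of $f$ are entered as in the session above.

```python
def mat_mul_mod(a, b, p):
    """Matrix product of a and b with entries reduced modulo p."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) % p
             for j in range(len(b[0]))]
            for i in range(len(a))]


def poly_at_matrix_mod(coeffs, a, p):
    """Evaluate c_0 + c_1 X + ... at the square matrix a, modulo p."""
    n = len(a)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):          # Horner's rule
        result = mat_mul_mod(result, a, p)
        for i in range(n):
            result[i][i] = (result[i][i] + c) % p
    return result


P = 5
B = [[4, 2, 1], [4, 1, 4], [2, 4, 3]]
F_COEFFS = [4, 2, 3, 4]                 # f = 4 + 2X + 3X^2 + 4X^3

B2 = mat_mul_mod(B, B, P)
B3 = mat_mul_mod(B2, B, P)
f_of_b = poly_at_matrix_mod(F_COEFFS, B, P)

if __name__ == "__main__":
    print(B2, B3, f_of_b)
```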


Now let us find the canonical form for the matrix $A$ in $M_4(\mathbf{Z}_{17})$
represented by

      ⎕←A←4 4⍴8 6 1 11 15 7 11 12 6 5 3 12 2 0 6 2
 8 6  1 11
15 7 11 12
 6 5  3 12
 2 0  6  2

Reducing $A - XI$ as before,

      N←17
      DAZV D←ZNXREDUCE A1←(4 4 1⍴,A),-(⍳4)∘.=⍳4
1 0 0 0    0 0 0 0     0 0 0 0     0 0  0 0
0 0 0 0    1 0 0 0     0 0 0 0     0 0  0 0
0 0 0 0    0 0 0 0    15 1 0 0     0 0  0 0
0 0 0 0    0 0 0 0     0 0 0 0    12 9 16 1

we see that $A - XI$ is equivalent to the reduced matrix
$$D = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 15 + X & 0 \\ 0 & 0 & 0 & 12 + 9X + 16X^2 + X^3 \end{pmatrix},$$
so the canonical form for $A$ is the direct sum of the companion matrices of
$15 + X$ and $12 + 9X + 16X^2 + X^3$; that is,
$$C = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 5 & 8 & 1 \end{pmatrix}.$$
Now $A$ is similar to $C$, so there is a matrix $Q$ in $GL_4(\mathbf{Z}_{17})$ such that
$C = QAQ^{-1}$. We will use Theorem 6.6.10 to find one such $Q$.
The procedure ZNXREDUCE computed two arrays R and S.


      DAZV R
 3  0  0     0  0  0     0  0  0     0  0  0
 0  0  0     0  0  0     0  0  0     9  0  0
14  0  0     0  0  0     7  0  0     8  7  0
14 15  0     9  0  0     8 11  0    14 16 16

      DAZV S
 0  0  0     1  0  0    14  0  0     9  5  0
 1  0  0    10  3  0     1  8  0    16  5 15
 0  0  0     0  0  0     1  0  0     8  7  0
 0  0  0     0  0  0     0  0  0     1  0  0

These arrays represent the following matrices over $\mathbf{Z}_{17}[X]$.
$$R = \begin{pmatrix} 3 & 0 & 0 & 0 \\ 0 & 0 & 0 & 9 \\ 14 & 0 & 7 & 8 + 7X \\ 14 + 15X & 9 & 8 + 11X & 14 + 16X + 16X^2 \end{pmatrix},$$
$$S = \begin{pmatrix} 0 & 1 & 14 & 9 + 5X \\ 1 & 10 + 3X & 1 + 8X & 16 + 5X + 15X^2 \\ 0 & 0 & 1 & 8 + 7X \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$


The product $R(A - XI)S$ is $D$. According to Theorem 6.6.10, we must
compute $U = S^{-1}$. To find $U$, we use ZNXMATINV.

      DAZV U←ZNXMATINV S
7 14    1 0    3 0    16  0
1  0    0 0    3 0     1  8
0  0    0 0    1 0     9 10
0  0    0 0    0 0     1  0

and find that
$$U = \begin{pmatrix} 7 + 14X & 1 & 3 & 16 \\ 1 & 0 & 3 & 1 + 8X \\ 0 & 0 & 1 & 9 + 10X \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
Let $V = (\mathbf{Z}_{17})^4$ and let $T$ be the endomorphism of $V$ given by
multiplication on the right by $A$. The $\mathbf{Z}_{17}[X]$-module $V_T$ is isomorphic to
$$(\mathbf{Z}_{17}[X]/\langle f \rangle) \oplus (\mathbf{Z}_{17}[X]/\langle g \rangle),$$
where $f = 15 + X$ and $g = 12 + 9X + 16X^2 + X^3$. Let $v_1, v_2, v_3, v_4$ be the
standard basis of $V$ and for $1 \le i \le 4$ define
$$w_i = \sum_{j=1}^{4} v_j U_{ij}.$$
By Theorem 6.6.10, $w_1$ and $w_2$ must be 0 and $V_T$ is the direct sum of the
$\mathbf{Z}_{17}[X]$-submodules generated by $w_3$ and $w_4$, which are isomorphic to
$\mathbf{Z}_{17}[X]/\langle f \rangle$ and $\mathbf{Z}_{17}[X]/\langle g \rangle$, respectively.

Now $v_jU_{ij}$ means the product of $v_j$ and $U_{ij}$ evaluated at $A$. Thus we
must evaluate all the entries of $U$ at $A$. Here is one way to do this. Since
each $U_{ij}$ has degree at most 1, we combine the identity matrix and $A$ into a
single rank 3 array.


      ⎕←H←2 4 4⍴(,(⍳4)∘.=⍳4),,A
 1 0  0  0
 0 1  0  0
 0 0  1  0
 0 0  0  1

 8 6  1 11
15 7 11 12
 6 5  3 12
 2 0  6  2

Next we form

      K←U ZNMATPROD H

The array K has rank 4 and K[I;J;;] is the value of the polynomial with
coefficients U[I;J;] at the matrix $A$. For example,

      ⎕IO←1
      U[3;4;]
9 10
      17|(9×H[1;;])+10×H[2;;]
 4  9 10  8
14 11  8  1
 9 16  5  1
 3  0  9 12
      K[3;4;;]
 4  9 10  8
14 11  8  1
 9 16  5  1
 3  0  9 12

For any matrix $B$ in $M_4(\mathbf{Z}_{17})$, the product $v_jB$ is just the $j$th row of
$B$. Thus, to compute the $v_jU_{ij}$, we want a rank 3 array L such that L[I;J;]
is K[I;J;J;]. This array is

      L←1 2 2 3⍉K

Finally, we must sum on J to get the $w_i$, which are given by the rows of

      ⎕←W←17|+/[2]L
0 0  0  0
0 0  0  0
3 0 10 12
0 0  0  1

As promised, $w_1 = w_2 = 0$. Now the $\mathbf{Z}_{17}[X]$-submodule $M$ of $V_T$ generated
by $w_3$ is one dimensional. The submodule $N$ generated by $w_4$ has dimension
3 and a $\mathbf{Z}_{17}$-basis $w_4$, $w_4A$, and $w_4A^2$. Relative to this basis, the
companion matrix for $g$ is the matrix for $T$ restricted to $N$. Thus
$w_3, w_4, w_4A, w_4A^2$ is a basis of $V$ and relative to this basis $C$ is the
matrix for $T$. The new basis is given by the rows of Q constructed as
follows.


      Q←4 4⍴0
      Q[1 2;]←W[3 4;]
      Q[3;]←W[4;] ZNMATPROD A
      Q[4;]←Q[3;] ZNMATPROD A
      Q
3 0 10 12
0 0  0  1
2 0  6  2
5 8 15 13

This is the matrix we have been looking for!

      ⎕←QI←ZNMATINV Q
14  9 5  0
 0 16 5 15
 1  8 7  0
 0  1 0  0
      Q ZNMATPROD A ZNMATPROD QI
2 0 0 0
0 0 1 0
0 0 0 1
0 5 8 1
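
The final check can be repeated independently of CLASSLIB. In the Python sketch below, the hypothetical helper `mat_mul_mod` stands in for ZNMATPROD; the matrices $A$, $C$, $Q$, and $Q^{-1}$ are copied from the session above, and the test $QA = CQ$ is used alongside the conjugation so that the conclusion does not rest on the inverse alone.

```python
def mat_mul_mod(a, b, p):
    """Matrix product of a and b with entries reduced modulo p."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) % p
             for j in range(len(b[0]))]
            for i in range(len(a))]


P = 17
A = [[8, 6, 1, 11], [15, 7, 11, 12], [6, 5, 3, 12], [2, 0, 6, 2]]
# Rational canonical form: companion matrices of 15 + X and 12 + 9X + 16X^2 + X^3.
C = [[2, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 5, 8, 1]]
# Rows of Q are w3, w4, w4*A, w4*A^2.
Q = [[3, 0, 10, 12], [0, 0, 0, 1], [2, 0, 6, 2], [5, 8, 15, 13]]
Q_INV = [[14, 9, 5, 0], [0, 16, 5, 15], [1, 8, 7, 0], [0, 1, 0, 0]]

IDENTITY = [[int(i == j) for j in range(4)] for i in range(4)]
qa = mat_mul_mod(Q, A, P)
cq = mat_mul_mod(C, Q, P)
conjugate = mat_mul_mod(qa, Q_INV, P)

if __name__ == "__main__":
    print(mat_mul_mod(Q, Q_INV, P) == IDENTITY, qa == cq, conjugate == C)
```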
EXERCISES 
1 Let $C$ be in $M_n(F[X])$ and let $N$ be the submodule of $F[X]^n$
generated by the rows of $C$. Suppose $f = \det C$ is not 0. Prove that
$F[X]^n/N$ has dimension $\deg(f)$.
2 Let $A$ be in $M_n(F)$ and let $f = c_n - c_{n-1}X + c_{n-2}X^2 - \cdots +
(-1)^{n-1}c_1X^{n-1} + (-1)^nX^n$ be the characteristic polynomial of
$A$. Show that $c_i$ is the sum of the determinants of the i-by-i diagonal
submatrices of $A$, that is, the i-by-i submatrices obtained by
selecting a set of $i$ row indices and the same set of $i$ column indices.
In particular, $c_n = \det A$ and $c_1$ is the trace $A_{11} + \cdots + A_{nn}$ of $A$.
3 Show directly from the definitions that similar matrices have the
same characteristic polynomials.
4 InM,(Z,3) let 
5 12 9 3 0 6 
A= | 2 10 6 |, B= 1] 10 OO]; , 
5 7 4 2 12 5 
5 0 9 8 2 6 
C= | 5 9 8 , D= 7 0 7 
6 2 4 9 3 1] 
Decide whether A, B, C, D are similar to each other. 
5 Write down the companion matrices of the polynomials 7 — 3X + 
4X2 + X3 and 1—5X+X3+X5 in QLX]. 
6 Let A bea matrix in M.(R) such that A — XJ is equivalent to 


394 LINEAR TRANSFORMATIONS 


& 


where f= X? + 1 and g = (X — 1)* (X? + 1). What is the rational 
canonical form for A? What are the minimal and characteristic 
polynomials of A? 


7 Determine the rational canonical form C for the matrix 


] —] —] —] 


] —] 0 —] 

A= 
—] l —] 0 
] l —] —] 


in M,(Z3). 
In Exercise 7, find Q in GL4(Z,;) such that C= QAQ™. 
9 A matrix A in M,)(R) has minimal polynomial X?(X + 1)? and 


characteristic polynomial X°(X¥ + 1)*. What are the possible ra- 
tional canonical forms for A? 


10 Show that the matrices 


1] 14 l 8 16 6 3 3 

18 5 17 6 7 15 10 3 
A= and B= 

2 10 10 9 10 13 1] 1] 

5 14 16 0 6 18 5 3 


in M,(Zi9) are similar and find Q in GL,4(Z,,) such that B = 
OQO"AQ. 

11 Let f and g be monic and relatively prime in F[X]. Prove that the 
companion matrix of fg is similar to the direct sum of the com- 
panion matrix of f and the companion matrix of g. 


12 LetA beinM,, (F) and B in M, (fF). Show that 


A 0) B 0 
C= | | and D= | | 
0) B 0 A 


are similar and find Q in GL,, +,(F) such that D=Q71AQ. 
13 Let $V$ and $W$ be vector spaces over the same field and let
$v_1, \ldots, v_m$ and $w_1, \ldots, w_n$ be bases of $V$ and $W$, respectively. Show that
$(v_1, 0), \ldots, (v_m, 0), (0, w_1), \ldots, (0, w_n)$ is a basis of $V \oplus W$.


3. EIGENVALUES AND EIGENVECTORS 


In Theorem 7.4.2 we encountered the problem of deciding, for a certain
linear transformation T on a vector space V, which elements v of V satisfy
vT = v. In this section we will discuss a slightly more general question.

Let V be a vector space over a field F and let T be in End_F(V). An
eigenvector for T is a nonzero element v of V such that vT = λv for some λ
in F. We call λ the eigenvalue of T associated with v. (The adjective "eigen"
in German means "own" or "characteristic of". The terms "characteristic
value" and "proper value" are sometimes used instead of "eigenvalue".)
Note that λ may be 0 but v must be nonzero and that v determines λ but
λ does not usually determine v. If A is in M_n(F), then an eigenvector of
A is a nonzero element x of F^n such that xA = λx for some λ in F. As
should be expected, the two types of eigenvectors are closely related.

THEOREM 1. Let V be a vector space over a field F, let T be in
End_F(V), and let A be the matrix of T with respect to a basis v_1, ..., v_n
of V. Then v in V is an eigenvector of T if and only if the coordinate vector
x for v relative to v_1, ..., v_n is an eigenvector of A.


Proof. The coordinate vector for vT is xA and v is nonzero if and only
if x is nonzero. The coordinate vector for λv is λx. Thus vT = λv if and only
if xA = λx. □


If A is the matrix


      ⎕←A←3 3⍴6 8 ¯13 ¯4 ¯9 19 0 ¯3 8
 6  8 ¯13
¯4 ¯9  19
 0 ¯3   8


in M_3(Q), then 1 1 ¯1 is an eigenvector of A with eigenvalue 2.


      1 1 ¯1+.×A
2 2 ¯2
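
The same check can be repeated outside APL. The following Python sketch is not part of the original text. It assumes that A has rows (6, 8, -13), (-4, -9, 19), (0, -3, 8) and that x = (1, 1, -1), with the negative signs that are easily lost in print restored; the helper name vec_mat is purely illustrative. It uses the row-vector convention xA of this section.

```python
# Left-eigenvector check for the 3-by-3 rational example: is x A = 2 x?
A = [[6, 8, -13],
     [-4, -9, 19],
     [0, -3, 8]]
x = [1, 1, -1]

def vec_mat(x, A):
    """Row vector times matrix, matching the convention x A used in the text."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]

xA = vec_mat(x, A)
print(xA)                         # the product x A
print(xA == [2 * c for c in x])   # True exactly when x A = 2 x
```
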


THEOREM 2. Let A be in M_n(F) and let λ be in F. Then λ is an
eigenvalue of A if and only if λ is a root of the characteristic polynomial of A.

Proof. Suppose x in F^n is an eigenvector of A with λ as its associated
eigenvalue. Then xA = λx = x(λI), where I is the n-by-n identity matrix.
Thus x(A - λI) = 0. Since x ≠ 0, the matrix A - λI cannot be invertible, so
det(A - λI) = 0. Therefore λ is a root of the characteristic polynomial f =
det(A - XI) of A. Conversely, if f(λ) = 0, then det(A - λI) = 0, and A - λI
has rank less than n. Therefore the linear system x(A - λI) = 0 has a nonzero
solution. □


Note that the linear system x(A - λI) = 0 in the proof of Theorem 2
would normally be written (A^t - λI)x = 0. The set W of solutions of this
system, including the zero vector, is called the λ-eigenspace of A. Let


      ⎕←A←4 4⍴3 1 4 0 2 2 2 1 2 4 1 4 3 0 3 1
3 1 4 0
2 2 2 1
2 4 1 4
3 0 3 1
      I←(⍳4)∘.=⍳4


in M_4(Z_5). From


      N←5
      ZNDET A1←A-2×I
0


we see that 2 is an eigenvalue of A. To find the eigenvectors, we solve the
homogeneous linear system with matrix ⍉A1.


      (⍉A1) ZNLSYS 4⍴0
0 0 0 0
      W
1 1 1 0
0 1 0 1

The eigenvectors of A that have 2 as their associated eigenvalue are the
nonzero elements of the subspace of (Z_5)^4 spanned by the rows of W.
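
As a cross-check, the computation can be sketched in Python. This is not the book's ZNDET or ZNLSYS; the function det_mod is a hypothetical stand-in that computes a determinant modulo a prime by Gaussian elimination. The sketch assumes the modulus is 5, that A has rows (3, 1, 4, 0), (2, 2, 2, 1), (2, 4, 1, 4), (3, 0, 3, 1), and that W has rows (1, 1, 1, 0) and (0, 1, 0, 1).

```python
N = 5
A = [[3, 1, 4, 0], [2, 2, 2, 1], [2, 4, 1, 4], [3, 0, 3, 1]]
W = [[1, 1, 1, 0], [0, 1, 0, 1]]
lam = 2

def det_mod(M, p):
    """Determinant of M modulo a prime p, by Gaussian elimination."""
    M = [[v % p for v in row] for row in M]
    n, det = len(M), 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c]), None)
        if pivot is None:
            return 0                      # no pivot in this column: singular
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            det = -det                    # a row swap changes the sign
        det = det * M[c][c] % p
        inv = pow(M[c][c], -1, p)         # modular inverse (Python 3.8+)
        for r in range(c + 1, n):
            f = M[r][c] * inv % p
            M[r] = [(a - f * b) % p for a, b in zip(M[r], M[c])]
    return det % p

# A1 = A - 2I, as in the APL dialogue.
A1 = [[A[i][j] - (lam if i == j else 0) for j in range(4)] for i in range(4)]

def vec_mat_mod(x, M, p):
    return [sum(x[i] * M[i][j] for i in range(len(x))) % p for j in range(len(M[0]))]

print(det_mod(A1, N))                        # 0 means 2 is an eigenvalue
print([vec_mat_mod(w, A1, N) for w in W])    # zero rows mean w A = 2 w in Z_5
```
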
It often happens that F is not a splitting field for the characteristic
polynomial of an element A of M_n(F). For example, the real matrix

        |  0  1 |
    A = |       |
        | -1  0 |

has characteristic polynomial

        | -X   1 |
    det |        | = 1 + X^2
        | -1  -X |

and 1 + X^2 has no roots in R. However, if we consider A to be in M_2(C),
then (i, 1) and (-i, 1) are eigenvectors of A with associated eigenvalues
i and -i, respectively.

If A and B are similar matrices, then A and B have the same
characteristic polynomials and therefore the same eigenvalues. They do not
necessarily have the same eigenvectors. However, for each eigenvalue λ, the
λ-eigenspaces of A and B have the same dimension.


THEOREM 3. Let A be in M_n(F). Then every eigenvalue of A is a root
of the minimal polynomial of A.


Proof. Let x be an eigenvector with eigenvalue λ. Then xA = λx, xA^2 =
(xA)A = (λx)A = λ^2 x and, by induction, we have xA^i = λ^i x for all
nonnegative integers i. Thus if f is in F[X], then xf(A) = f(λ)x. If f is the
minimal polynomial of A, then f(A) = 0, so f(λ)x = 0. Since x is nonzero,
f(λ) must be 0. □

We could also prove Theorem 3 using the description of the
characteristic polynomial immediately preceding Theorem 2.5. Theorem 3 shows
it is impossible for a matrix to have minimal polynomial X(X - 1) and
characteristic polynomial X^2 (X - 1)^2 (X + 2), since the eigenvalue -2
would not be a root of the minimal polynomial.

A diagonal matrix is a square matrix whose entries off the main
diagonal are 0. For example,


l 0 0 0 
0 Nt 0 0 
0 O —2 0 
0 0 0 3 


is a diagonal matrix. Linear transformations are considered particularly 
simple if some matrix for them is diagonal. This suggests the problem of 
deciding whether a given square matrix is diagonalizable, that is, similar to 
a diagonal matrix. 

THEOREM 4. Let λ_1, ..., λ_r be distinct elements of F and let D be
the n-by-n diagonal matrix

    D = diag(λ_1, ..., λ_1, λ_2, ..., λ_2, ..., λ_r, ..., λ_r),

where λ_i occurs n_i times on the diagonal. Then the minimal polynomial of
D is (X - λ_1)...(X - λ_r) and the characteristic polynomial of D is
(λ_1 - X)^{n_1} ... (λ_r - X)^{n_r}.

Proof. The matrix D - XI is a diagonal matrix with n_i diagonal entries
λ_i - X, 1 ≤ i ≤ r. Thus the characteristic polynomial of D, which is
det(D - XI), is (λ_1 - X)^{n_1} ... (λ_r - X)^{n_r}. Each λ_i is an eigenvalue
of D and so by Theorem 3 the minimal polynomial of D is divisible by
f = (X - λ_1)...(X - λ_r). Now f(D) is (D - λ_1 I)...(D - λ_r I). Since the
first n_1 diagonal entries of D - λ_1 I are 0, the next n_2 diagonal entries of
D - λ_2 I are 0, and so on, f(D) = 0. Therefore the minimal polynomial of D
divides f. Since f is monic, f must be the minimal polynomial. □


We can now describe a necessary and sufficient condition for a square 
matrix to be diagonalizable. 


THEOREM 5. Let A be in M_n(F). Then A is diagonalizable in M_n(F)
if and only if the minimal polynomial of A has the form (X - λ_1)...(X -
λ_r), where λ_1, ..., λ_r are distinct elements of F.


Proof. If A is diagonalizable, then A is similar in M_n(F) to a diagonal
matrix D, which by Theorem 4 has minimal polynomial (X - λ_1)...(X -
λ_r), where λ_1, ..., λ_r are the distinct eigenvalues of D. Since A and D are
similar, A and D have the same minimal polynomial.

Now suppose A has the minimal polynomial f = (X - λ_1)...(X - λ_r).
Let V = F^n and let T be the element of End_F(V) with vT = vA for all v in
V. The matrix A - XI is equivalent over F[X] to a matrix

    diag(f_1, ..., f_n),

where the f_i are monic polynomials and f_i divides f_{i+1}, 1 ≤ i < n.
Moreover, f_n = f. The F[X]-module V_T is isomorphic to the direct sum of
the modules F[X]/<f_i>, 1 ≤ i ≤ n. If 1 ≤ i ≤ n, then f_i divides f_n and so
f_i = (X - μ_1)...(X - μ_s), where the μ_j are distinct elements of F. By
Theorem 7.4.3, F[X]/<f_i> is isomorphic to the direct sum of the modules
F[X]/<X - μ_j>, 1 ≤ j ≤ s. This means that V_T is a direct sum of
one-dimensional submodules, V = V_1 ⊕ ... ⊕ V_n. If v_i is a nonzero element
of V_i, then v_i is an eigenvector of T, and v_1, ..., v_n form a basis of V.
With respect to this basis, the matrix D of T is diagonal. Since A and D are
similar, A is diagonalizable. □


The matrix


      ⎕←A←4 4⍴0 4 7 5 1 9 7 1 6 3 4 0 3 1 5 7
0 4 7 5
1 9 7 1
6 3 4 0
3 1 5 7


in M_4(Z_11) has characteristic polynomial 3 + 10X + 2X^3 + X^4,


      N←11
      ZNXDET (4 4 1⍴A),-I←(⍳4)∘.=⍳4
3 10 0 2 1


which, in Z_11[X], is (X - 2)^2 (X - 8)^2. If f = (X - 2)(X - 8), then f(A)
is 0.


      (A-2×I) ZNMATPROD A-8×I
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0


By Theorem 5, A is diagonalizable. As a check, we can compute the
eigenspaces for λ = 2 and λ = 8.


      (⍉A-2×I) ZNLSYS 4⍴0
0 0 0 0
      ⎕←U←W
4 2 1 0
6 9 0 1
      (⍉A-8×I) ZNLSYS 4⍴0
0 0 0 0
      ⎕←V←W
3 7 1 0
2 2 0 1

The rows of U are a basis for the 2-eigenspace and the rows of V are a basis
for the 8-eigenspace. If


      ⎕IO←1
      ⎕←P←U,[1]V
4 2 1 0
6 9 0 1
3 7 1 0
2 2 0 1


then P has nonzero determinant,


      ZNDET P
5


and so the rows of P are a basis for (Z_11)^4. A diagonal matrix similar to
A is


      ⎕←D←P ZNMATPROD A ZNMATPROD ZNMATINV P
2 0 0 0
0 2 0 0
0 0 8 0
0 0 0 8


The diagonal entries of D are just the eigenvalues associated with the rows of
P.
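
The diagonalization can be checked without computing an inverse, since D = PAP^{-1} is equivalent to PA = DP when P is invertible. The Python sketch below is not from the original text. It assumes the modulus 11 and the matrices A with rows (0, 4, 7, 5), (1, 9, 7, 1), (6, 3, 4, 0), (3, 1, 5, 7), P with rows (4, 2, 1, 0), (6, 9, 0, 1), (3, 7, 1, 0), (2, 2, 0, 1), and D = diag(2, 2, 8, 8); mat_mul_mod is an illustrative helper, not the book's ZNMATPROD.

```python
N = 11
A = [[0, 4, 7, 5], [1, 9, 7, 1], [6, 3, 4, 0], [3, 1, 5, 7]]
P = [[4, 2, 1, 0], [6, 9, 0, 1], [3, 7, 1, 0], [2, 2, 0, 1]]
D = [[2, 0, 0, 0], [0, 2, 0, 0], [0, 0, 8, 0], [0, 0, 0, 8]]

def mat_mul_mod(X, Y, n):
    """Matrix product with every entry reduced modulo n."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) % n
             for j in range(len(Y[0]))] for i in range(len(X))]

PA = mat_mul_mod(P, A, N)
DP = mat_mul_mod(D, P, N)
print(PA == DP)   # True exactly when each row of P is an eigenvector of A

# f(A) for f = (X - 2)(X - 8), the minimal polynomial candidate.
shift = lambda M, c: [[M[i][j] - (c if i == j else 0) for j in range(4)] for i in range(4)]
fA = mat_mul_mod(shift(A, 2), shift(A, 8), N)
print(fA)
```
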

In Section 2 we described one canonical form for matrices under 
similarity, the rational canonical form. There is another widely used canon- 
ical form, the Jordan canonical form, which we can now describe. It is based 
on the following simple observation. 


LEMMA 6. Let λ be an element of F and let I be the ideal of F[X]
generated by (X - λ)^m. Then the matrix for multiplication by X on F[X]/I
with respect to the basis 1 + I, (X - λ) + I, ..., (X - λ)^{m-1} + I is

        | λ 1           |
        |   λ 1      0  |
    J = |     .  .      |
        |  0     λ 1    |
        |          λ    |

Proof. The following identity is obvious.

    (X - λ)^i X = λ(X - λ)^i + (X - λ)^{i+1}.

Thus for 0 ≤ i < m - 1 we have

    [(X - λ)^i + I]X = λ[(X - λ)^i + I] + [(X - λ)^{i+1} + I].

Since (X - λ)^m is in I,

    [(X - λ)^{m-1} + I]X = λ[(X - λ)^{m-1} + I]. □


The matrix J in Lemma 6 is called a Jordan block. [French
mathematician Camille Jordan (1838-1922) made important contributions in
algebra and in other areas of mathematics.] The minimal polynomial of J is
(X - λ)^m. The only eigenvalue of J is λ, and the λ-eigenspace of J is one
dimensional.
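
The claim about the minimal polynomial can be illustrated numerically. The Python sketch below is not from the original text; it assumes the upper bidiagonal form of J given in Lemma 6 and picks λ = 3 and m = 4 arbitrarily. The helper names are hypothetical. It checks that (J - λI)^{m-1} is nonzero while (J - λI)^m is zero, which is what it means for (X - λ)^m to be the minimal polynomial.

```python
def jordan_block(lam, m):
    """m-by-m matrix with lam on the diagonal and 1 just above it."""
    return [[lam if j == i else (1 if j == i + 1 else 0) for j in range(m)]
            for i in range(m)]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_pow(X, k):
    """X multiplied by itself k times, for k >= 1."""
    R = X
    for _ in range(k - 1):
        R = mat_mul(R, X)
    return R

lam, m = 3, 4
J = jordan_block(lam, m)
Nil = [[J[i][j] - (lam if i == j else 0) for j in range(m)] for i in range(m)]
zero = [[0] * m for _ in range(m)]

print(mat_pow(Nil, m - 1) != zero)   # (J - lam I)^(m-1) is not zero
print(mat_pow(Nil, m) == zero)       # (J - lam I)^m is zero
```
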

THEOREM 7. Let A be in M_n(F) and assume all the roots of the
characteristic polynomial of A lie in F. Then A is similar to a direct sum of
Jordan blocks, which are unique up to order.


Proof. Let A - XI be equivalent over F[X] to the reduced matrix

    diag(f_1, ..., f_n)

and let λ_1, ..., λ_r be the distinct roots of f_n in F. Then

    f_i = (X - λ_1)^{m_{i1}} ... (X - λ_r)^{m_{ir}}

and m_{ij} ≤ m_{i+1,j} for 1 ≤ i < n. If V = F^n and T in End_F(V) is
multiplication by A, then V_T is isomorphic to the direct sum of the modules
F[X]/<f_i>, which by Theorem 7.4.3 are isomorphic to the direct sums of the
modules F[X]/<(X - λ_j)^{m_{ij}}>. Thus we can write V_T as a direct sum of
modules V_{ij}, 1 ≤ i ≤ n, 1 ≤ j ≤ r, such that V_{ij} is isomorphic to
F[X]/<(X - λ_j)^{m_{ij}}>. By Lemma 6, we can choose a basis for V_{ij} such
that the matrix for T|V_{ij} is the m_{ij}-by-m_{ij} matrix

             | λ_j 1          |
             |     .  .    0  |
    J_{ij} = |                |
             |  0     λ_j 1   |
             |          λ_j   |

Putting these bases together, we get a basis for V such that the matrix for
T is the direct sum of the J_{ij}.

Now let us prove the uniqueness of the Jordan blocks. Suppose A is
similar to a direct sum of m_{ij}-by-m_{ij} Jordan blocks J_{ij} having λ_j on
the diagonal, 1 ≤ i ≤ n, 1 ≤ j ≤ r, where λ_1, ..., λ_r are distinct elements
of F and m_{ij} ≤ m_{i+1,j}, 1 ≤ i < n. Then the module V_T is isomorphic to
the direct sum of submodules V_{ij} isomorphic to F[X]/<(X - λ_j)^{m_{ij}}>.
By Theorem 7.4.3,

    V_{i1} ⊕ V_{i2} ⊕ ... ⊕ V_{ir} ≅ F[X]/<f_i>,

where

    f_i = (X - λ_1)^{m_{i1}} ... (X - λ_r)^{m_{ir}}.

Thus V_T is isomorphic to F[X]^n/S(D), where

    D = diag(f_1, ..., f_n)

and A - XI is equivalent to D. Now each f_i is monic and f_i divides f_{i+1}
for 1 ≤ i < n. Therefore D is reduced over F[X] and the f_i are the elementary
divisors of A - XI. Changing either the eigenvalues λ_j or the sizes m_{ij} of
the blocks yields a different sequence f_1, ..., f_n. Thus the Jordan blocks
are unique up to order. □


A matrix that is the direct sum of Jordan blocks is said to be in Jordan 
canonical form. 
Let us find the Jordan canonical form for the matrix


      ⎕←A←3 3⍴10 9 7 7 8 4 11 11 11
10  9  7
 7  8  4
11 11 11


in M_3(Z_13). The characteristic polynomial of A is 1 - 3X + 3X^2 - X^3 =
(1 - X)^3.


      N←13
      ZNXDET (3 3 1⍴A),-I←(⍳3)∘.=⍳3
1 10 3 12


Thus the only eigenvalue of A is 1. Since A is not equal to I, we know that
A is not diagonalizable, so its Jordan canonical form has a block of size
greater than 1. Since the square of A-I is the zero matrix,


      (A-I) ZNMATPROD A-I
0 0 0
0 0 0
0 0 0


the minimal polynomial of A is (X - 1)^2, and the Jordan canonical form
for A has no blocks of size 3. Thus the Jordan canonical form must be


      ⎕←B←3 3⍴1 0 0 0 1 1 0 0 1
1 0 0
0 1 1
0 0 1
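
The two facts used in this argument, that A - I is nonzero and that its square is zero in Z_13, can be confirmed with a short Python sketch. It is not part of the original text and assumes A has rows (10, 9, 7), (7, 8, 4), (11, 11, 11); mat_mul_mod is an illustrative helper rather than the book's ZNMATPROD.

```python
N = 13
A = [[10, 9, 7], [7, 8, 4], [11, 11, 11]]

def mat_mul_mod(X, Y, n):
    """Matrix product with every entry reduced modulo n."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) % n
             for j in range(len(Y[0]))] for i in range(len(X))]

# B = A - I, reduced modulo 13.
B = [[(A[i][j] - (1 if i == j else 0)) % N for j in range(3)] for i in range(3)]
zero = [[0, 0, 0] for _ in range(3)]

print(B != zero)                      # A is not the identity
print(mat_mul_mod(B, B, N) == zero)   # (A - I)^2 = 0, so the minimal polynomial is (X - 1)^2
```
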
EXERCISES 
1 Let T be an element of End_F(V), where V is finite dimensional
over F. Let λ_1, ..., λ_r be the distinct eigenvalues of T in F and
let V_i = {v ∈ V | vT = λ_i v}. Show that V_1 + V_2 + ... + V_r is V_1 ⊕
V_2 ⊕ ... ⊕ V_r.

2 For each field F and matrix A over F find the eigenvalues of A and
for each eigenvalue find a basis for the corresponding eigenspace.


(a) F=Zae, 0 1 1 
A=]2 1 4 
2 4 | 
(b) F=Q, ~ -§ 2 4 
A= 5 6 -S 
2 2 1 
(c) F=Z, 12 11 5 5 15 
5 -6 10 15 15 
A= | 14 4 2 13 #13 
_4 7 2 13 0 
—10 2 1 14 2 


3 Which of the matrices in Exercise 2 are diagonalizable?


4 Let J be a Jordan block in M_m(F) with eigenvalue λ. For i =
1, 2, ..., let V_i be the set of vectors x in F^m such that x(J - λI)^i =
0. Show that V_i has dimension i for 1 ≤ i ≤ m and dimension m for
i > m.

5 Let A be in M_n(F) and assume all eigenvalues of A lie in F. For
each eigenvalue λ and each positive integer i, let V_i(λ) be the set of
vectors x in F^n such that x(A - λI)^i = 0. Show that the number of
Jordan blocks with eigenvalue λ in the Jordan canonical form for
A having size at least m is dim_F V_m(λ) - dim_F V_{m-1}(λ). [Set
V_0(λ) = {0}.]


6 For each field F and matrix A over F find the Jordan canonical 


form for A. 
(a) F=Z., 6 6 3 
A=] 5 6 l 
4 l 4 
(b) F=Zi1, 3 7 l 8 8 
5 7 2 l 10 
A= | 9 10 l 7 5 
7 8 2 4 4 
9 8 l 10 3 
(c) F=Q, —2 O -|l 
A= | —-8 2 —2 
] 0 0 


7 For each matrix A in Exercise 6 find an invertible matrix Q such
that QAQ^{-1} is in Jordan canonical form.


Appendix 1
THE APL LANGUAGE 


This appendix concentrates mainly on APL as a formal system of mathema- 
tical notation. The aspects of the language related to its implementation in 
terminal systems are discussed primarily in Appendix 2. We will, however, 
give a brief description of the operation of an APL terminal system so that 
readers can follow the sample dialogues presented in the text and reproduce 
them at a terminal. We will also show how to combine APL statements into 
procedures that can be executed on a terminal system. 

The treatment of APL presented in this book is self-contained and goes 
deeply into the parts of the language that are likely to be of use to readers 
in solving the algebraic problems discussed. Certain aspects of APL are not 
covered. These include the system functions, some of the system variables, 
terminal input and output, the CONTINUE workspace, and accessing files. 
Individuals wishing to fill in these gaps should consult one or more of the 
books on APL listed in the bibliography. 

It is now possible to work with APL on small, one-user computers. 
However, we will assume here that the APL system is running on a large 
computer that serves many individuals simultaneously. Before attempting 
to use a particular terminal system, one must obtain an account number for 
that system. Armed with an account number and having found a terminal 
appropriate to the system, one must connect the terminal to the computer 
and identify oneself by the account number. This is called signing on. 
The details of the sign-on process vary, depending on the version of the 
terminal system and the type of terminal being used. The computer center 
providing the APL service can supply information about signing on. 

Once sign-on has been completed, the carriage or cursor of the terminal 
is positioned six spaces to the right of the left margin. At this point, one is 
free to type in an APL statement describing a computation to be performed. 
The statement is terminated by a carriage return. Now the computer begins 
to process the statement. In some cases results of the computation will be 
printed at the terminal. If the statement contains an error that makes com- 
plete processing impossible, an error message is printed. Any typing initiated 
by the computer begins at the left margin. When processing of the first 




statement has finished, the carriage or cursor is indented six spaces on a 
new line. A second APL statement may now be entered. 


APL KEYBOARD

[Figure: keyboard layout of an APL terminal]


The layout of the keyboard varies from terminal to terminal. The lay- 
out of one terminal is given here. Readers should familiarize themselves with 
the keyboard of the terminal available to them. 

In the description of the APL language that follows, no serious attempt 
has been made to anticipate every possible question. In almost all cases the 
best way to obtain the answer to “What would happen if. . .?”’ is simply to 
enter the particular construction at a terminal and see what happens. Be- 
ginning students of APL should let their terminal systems be the final arbiter 
concerning all questions about the APL language. 


1. A SAMPLE TERMINAL SESSION 


Let us imagine that we are seated in front of an APL terminal and that we 
have completed the sign-on process. We try a few simple arithmetic prob- 
lems, entering the problems one at a time, signaling the end of the line with 
a carriage return and waiting for the computer’s response. (To avoid wasting 
space on the page, the following listing has been broken in the middle and is 
presented in two columns. At the terminal, it would appear as a single long 
column.) 


      4+3                  4÷3
7                    1.333333333
      4-3                  4*3
1                    64
      4×3                  3*4
12                   81


It should be clear that the symbols +, -, ×, and ÷ represent the usual
arithmetic operations and that * is used for the operation of exponentiation
or raising to a power. Here are some more examples using two operations in
the same statement.


      4+3-2                4×3+2
5                    20
      4+3×2                4-3+2
10                   ¯1


The results of the last two computations may seem incorrect. This is
due to the order in which the operations are performed. Consider the third
statement, 4×3+2. In normal mathematical notation, multiplication is said
to take precedence over addition. Thus the product of 4 and 3 would be
computed first and the result added to 2, giving 14. However, in APL there
are so many basic or primitive operations that it was found to be impossible
to construct useful precedence rules. Instead, the following simple rule was
adopted.


The Right-to-Left Rule. Unless otherwise indicated by parentheses, the
operations in an APL statement are performed from right to left, each
operation taking as its right argument everything to the right of it.


In the statement 4×3+2 the rightmost operation + is performed
first. Thus 3 and 2 are added and the result multiplied by 4, giving 20. The
order of performing the operations can be changed using parentheses in the
usual way.


      (4×3)+2
14
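
To make the rule concrete, here is a minimal Python sketch of right-to-left evaluation. It is not APL and not from the original text: the function eval_rtl and the token-list input format are assumptions made for illustration, and the sketch handles only dyadic operations on numbers, with no parentheses and no monadic operations.

```python
import operator

# The five dyadic arithmetic operations met so far, keyed by their APL symbols.
OPS = {'+': operator.add, '-': operator.sub, '×': operator.mul,
       '÷': operator.truediv, '*': operator.pow}

def eval_rtl(tokens):
    """Evaluate [num, op, num, op, ..., num] from right to left.

    Each operation takes as its right argument the value of everything
    to its right, as the Right-to-Left Rule requires.
    """
    result = tokens[-1]
    for i in range(len(tokens) - 2, 0, -2):
        result = OPS[tokens[i]](tokens[i - 1], result)
    return result

print(eval_rtl([4, '×', 3, '+', 2]))   # 4×3+2
print(eval_rtl([4, '-', 3, '+', 2]))   # 4-3+2
```

With ordinary precedence the first expression would give 14; evaluated right to left it gives 20, and the second gives -1 (written ¯1 by APL).
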


Now consider the other troublesome statement, 4-3+2. Again, the
rightmost operation is performed first. Thus 3 and 2 are added and the
result subtracted from 4. The expected result is -1 but we find ¯1 instead.
The Right-to-Left Rule forces the symbol ¯ used when writing a negative
constant to be different from the symbol - used to denote the operations
of subtraction and negation. Note the difference in the following two
statements.


      -4+3                 ¯4+3
¯7                   ¯1


In the first case we wish the negative of 4+3, and in the second we are
asking for the sum of ¯4 and 3.


The definition of the APL division operation ÷ differs slightly from the
usual definition of division of real numbers. Ordinarily, division by 0 is never
permitted. However, in APL the expression 0÷0 is allowed and is given the
value 1.


      0÷0
1


The many primitive APL operations can be classified into two types,
according to the number of arguments they require. An operation with one
argument is called monadic or unary and the symbol for that operation is
always placed to the left of its argument. The only monadic operation we
have encountered so far is the negation operation in the statement -4+3.
An operation with two arguments is called dyadic or binary and the symbol
for the operation is placed between the arguments. We have seen five dyadic
operations: addition, subtraction, multiplication, division, and
exponentiation. In order to keep the number of symbols required to write APL
statements to a minimum, the same symbol is often used to represent a monadic
operation and also a related dyadic operation. No ambiguity can arise,
since a symbol being used monadically will have either nothing or another
operation to its left. Thus in the statement

      -1--2
¯3


the first and third minus signs represent monadic negation while the second
minus sign represents dyadic subtraction.

It is often convenient to save the result of a computation in the
computer instead of having it printed at the terminal and forgotten. This is
done by assigning the result to a variable using the assignment arrow ←.


      X←3+4                Q←X×X+Y
      X                    Q
7                    67.52025918
      Y←X*0.5              X←X+1
      Y                    X
2.645751311          8


Typing the name of a variable and a carriage return causes the current value
of that variable to be typed at the terminal. The Right-to-Left Rule applies
to ← so that in the preceding example X←X+1, 1 is added to the current
value of X and then this new value is assigned to X.

Often we will want to assign a value to a variable and, at the same
time, display that value at the terminal. This can all be done in one
statement.


      ⎕←Z←2+11×7-4
35


Assigning a value to the "quad" symbol ⎕ indicates that the value is to be
printed at the terminal.

Any relatively short sequence of letters, underlined letters, and digits 
is a valid variable name provided that the sequence starts with a letter or an 


underlined letter. No blank spaces are permitted within a variable name. 
Thus 


ABC SUM 
Xi C12B3 
SUM GREATESTCOMMONDIVISOR 


are all valid names. A letter and the same letter underlined are considered
to be different symbols, and so SUM and SUM written with underlined letters
are different names.

Several assignments may appear in one APL statement. The Right-to-
Left Rule governs the order in which these assignments are made.


      I←J←K←0              M←(A←4)×B←5
      I                    B
0                    5
      J                    A
0                    4
      K                    M
0                    20


Numbers with very large or very small absolute values may be entered
using scientific notation.


      AVOGADRO←6.022E23    PLANCK←6.626E¯34


Here we have entered Avogadro's number 6.022 × 10^23 and Planck's
constant 6.626 × 10^-34. Scientific notation may be used by the computer
for output.


      7*31
1.57775382E26


EXERCISES 
1 Carry out at a terminal all of the dialogues presented in this sec- 
tion. 


2 Evaluate each of the following APL expressions on paper and then 
check your answers by entering them at a terminal. 


(a) 7+12 (h) 0x0 (0) 6-4-2 
(b) 8-3 (i) 2*0.5 (p) 8x4+2 
(c) 2-9 (j) 3* 3 (q) 8=4x2 
(d) -5 (k) 6+4-2 (r) 87442 
(e) O00 (l) 6-4+2 (s) Ox0*0 
(f) 2*4 (m)6+ 4+2 (t) 5- 2 
(g) 4x2 (n) (6-4)+2 (u) 5-4-3-2 


3 What would be the final values of X, Y, and Z after the statement

      Z←(Y←2+Z)×X←3+Z←4×X←1

is entered at a terminal?


2. ARRAYS 


An APL statement may involve three kinds of objects: arrays, primitive op- 
erations, and defined procedures. This section is devoted to arrays. Defined 
procedures are discussed in Section 4; the remaining sections of this ap- 
pendix deal with the various types of primitive operations. 

The three most common types of arrays in APL are scalars, vectors,
and matrices. All of the constants and variables that we used in the previous
section were scalars, that is, single real numbers. A variable may also be a
vector.


      ⎕←V←1 2 3 5 8 13
1 2 3 5 8 13


The components of a vector of constants are separated by spaces. The
Ith component of V is denoted V[I], and I is referred to as an index.


      V[2]                 V[1 3 5]
2                    1 3 8
      V[5]                 V[2 4 6]
8                    2 5 13

In the last two examples a vector was used as an index to obtain a vector 
of components of V. 


The dyadic primitive operation ⍴ lets us reshape a vector into a matrix.


      ⎕←A←2 3⍴V            ⎕←B←3 2⍴V
1 2  3                1  2
5 8 13                3  5
                      8 13


In each of the preceding examples the first argument of ⍴ is a vector giving
the number of rows and columns the matrix is to have. The matrix is built
up from the second vector argument a row at a time. This reshape operation
is discussed more fully in Section 5.
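
For readers who know Python, the row-at-a-time construction can be sketched as follows. This is not from the original text: the function reshape is a hypothetical helper that covers only the matrix case, and its cyclic reuse of the elements when too few are supplied reflects APL's usual behavior rather than anything stated in this section.

```python
from itertools import cycle, islice

def reshape(rows, cols, values):
    """Build a rows-by-cols matrix from values, filling one row at a time."""
    flat = list(islice(cycle(values), rows * cols))
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

V = [1, 2, 3, 5, 8, 13]
print(reshape(2, 3, V))   # rows 1 2 3 and 5 8 13
print(reshape(3, 2, V))   # rows 1 2, 3 5, and 8 13
```
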

Two indices are required to specify a particular entry in a matrix. 
(One usually refers to the components of a vector. The term “‘entry”’ is used 
with matrices and more complicated arrays and may also be used with 
vectors. ) 


      A[1;2]               B[3;1]
2                    8


The two indices of a matrix are separated by a semicolon. An entire row or
column of a matrix may be obtained by leaving the appropriate index
unspecified.


      A[2;]                B[;1]
5 8 13               1 3 8


If two vectors are used as indices of a matrix, then the result is a
matrix.


      A[1 2;2 3]
2  3
8 13


Entries in an array can be changed using the assignment arrow.


      B                    B[2 3;1]←1 0
1  2                       B
3  5                 1  5
8 13                 1  5
      B[1;2]←5       0 13


The rank of an array is the number of indices required to specify a
particular entry in the array. Thus scalars, vectors, and matrices have rank
0, 1, and 2, respectively. APL permits the definition of arrays of any finite
rank. For example:


      ⎕←C←2 2 2⍴1 2 3 4 5 6 7 8
1 2
3 4

5 6
7 8
      C[1;2;1]
3
      C[;1;]
1 2
5 6


Here C is an array of rank 3. Note that C is displayed at the terminal as the
two matrices C[1;;] and C[2;;]. If


      ⎕←I←1                ⎕←K←1 1⍴1
1                    1
      ⎕←J←1⍴1              ⎕←L←1 1 1⍴1
1                    1


then each of the arrays I, J, K, and L has one entry, but I is scalar, J is
a vector of length 1, K is a 1-by-1 matrix, and L has rank 3.

The shape of an array is a vector that gives the number of different
values each index of the array may have. The monadic operation ⍴ gives the
shape of its argument.


      ⍴V                   ⍴C
6                    2 2 2
      ⍴A                   ⍴A[;1]
2 3                  2
      ⍴B                   ⍴V[1 3 5]
3 2                  3


The components of the shape of X are sometimes called the dimensions
of X. The length of ⍴X is the rank of X, which is therefore ⍴⍴X.


      ⍴⍴V                  ⍴⍴C
1                    3
      ⍴⍴A                  ⍴⍴I
2                    0
      ⍴⍴B
2


If X is scalar, then X has rank 0 and ⍴X is a vector of length 0, that is, a
vector with no components.


      ⍴1
                     (blank line "printed" at the terminal)


The vector V has six components and we have been indexing V by
elements of the set {1, 2, 3, 4, 5, 6}. There are times when it is more
convenient to index V by elements of the set {0, 1, 2, 3, 4, 5}. The smallest
value an index may have is called the index origin and is denoted ⎕IO.
The index origin is one of a class of variables called system variables, which
are described in Section A2.2. The origin may be 1 or 0 and can be changed
by a normal APL assignment. During the sign-on process of an APL terminal
system, the origin is automatically set to 1.


      V                    ⎕IO←0
1 2 3 5 8 13               V[2]
      V[2]           3
2                          V[1 3 5]
      V[1 3 5]       2 5 13
1 3 8


The origin affects the indexing of matrices and other arrays as well.


      A                    A[1;2]
1 2  3               2
5 8 13                     ⎕IO←0
      ⎕IO←1                A[1;2]
                     13


The monadic operation ⍳ is called the index generator. If N is a
nonnegative integer, then ⍳N is the vector of possible indices for a vector of
length N. This vector depends on the index origin.


      ⎕IO←1                ⎕IO←0
      ⍳3                   ⍳3
1 2 3                0 1 2
      ⍳5                   ⍳5
1 2 3 4 5            0 1 2 3 4


The vector ⍳0 is the empty vector.
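
The effect of the index origin can be mimicked in Python, whose own lists always use origin 0. The sketch below is not from the original text; the functions iota and index and their origin parameter are hypothetical stand-ins for ⍳ and bracket indexing under a given ⎕IO.

```python
def iota(n, origin=1):
    """The vector of valid indices for a vector of length n."""
    return list(range(origin, origin + n))

def index(v, i, origin=1):
    """The component of v selected by index i under the given origin."""
    return v[i - origin]

V = [1, 2, 3, 5, 8, 13]
print(iota(3), iota(3, origin=0))               # indices under origin 1 and origin 0
print(index(V, 2), index(V, 2, origin=0))       # V[2] under each origin
print(iota(0))                                  # the empty vector
```
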

Most of the arrays we will be using will have entries that are real
numbers. However, APL does allow the formation of character arrays, that
is, arrays whose entries may be any symbol on the APL keyboard.
Individual characters and vectors of characters are enclosed in single
quotation marks when entered at a terminal. For example:


      ⎕←E←'E'              ⍴NAME
E                    4
      ⍴⍴E                  ⍴⍴NAME
0                    1
      ⎕←NAME←'NAME'        NAME[1 2 3 0]
NAME                 AMEN


Here E is a scalar and NAME is a vector whose components are the letters
N, A, M, and E.

An array of any rank may be used as an index provided that all the
entries in the array are valid indices.


      ⍴C
2 2 2
      P←1 2 1
      ⎕←Q←2 2⍴1 2 2 1
1 2
2 1
      R←2 1 2⍴1 1 1 2
      ⍴R
2 1 2
      ⎕IO←1
      D←C[P;Q;R]
      ⍴D
3 2 2 2 1 2
      D[3;2;1;1;1;2]
3
      C[P[3];Q[2;1];R[1;1;2]]
3
EXERCISES 


1 Carry out at a terminal all of the dialogues presented in this section. 


2 Let A←2 3 5 7 11 and B←3 1 4. Evaluate each of the
following expressions on paper, first assuming the index origin is 1 and
then assuming the origin is 0. Check your answers at a terminal.


(a) A[4]      (d) ⍳7      (g) ⍴⍳10
(b) B[2 1]    (e) ⍴A      (h) ⍴⍴A[2]
(c) A[B]      (f) ⍴⍴B     (i) A[1+⎕IO]


3 Suppose the statements


      E←⍳5
      E[5 3 1 3 5]←⍳5


are entered at a terminal with ⎕IO set to 1. What would you
expect the resulting vector E to be? Check your answer at a terminal.
(Different APL systems may give different results.)


4 Enter the following matrices at a terminal. 


A x 
2 7 1 1 1 0 
3 #2 =5 o1t1 
B 00 0 
6 2 0 
nh 41 8 


5 Let A, B, and X be as in Exercise 4. Evaluate each of the following
expressions on paper, first assuming an origin of 1 and then an
origin of 0. Check your answers at a terminal.


(a) A[1;1]    (d) X[;2]   (g) ⍴A[1;]
(b) X[2;1]    (e) ⍴B      (h) ⍴⍴B[;1]
(c) B[1;]     (f) ⍴⍴X     (i) ⍴3 4⍴⍳12
6 Let ⎕IO←1 and C←'ABEFGILNRSU. '. Evaluate the following.
(a) ⍴C
(b) ⍴⍴C


(c) C[1 7 5 3 2 9 1 13 6 10 13 4 11 8 12]


7 Suppose that the shapes of the arrays A, B, C, D, are 3 2 5, 
2 3, 4 7 3, and 2 4 6 8, respectively. For each of the follow- 
ing arrays, give the rank and the shape, assuming each is defined. 


(a) ALB;C;3D] (d) CLB31;34] (g) BLB;3 ] 
(b) ALB3D;] (e) CLB3C3A] (h) BL 3B] 
(c) BLA;D] (f) DI3s33] (i) AL3;CLD3B;4];34] 


3. PRIMITIVE SCALAR OPERATIONS 


There are many primitive operations in APL. However, we will require only 
a small fraction of these operations in our study of algebra. In addition, 
the APL operations fall naturally into classes consisting of operations with 
similar properties, making the job of learning about the operations rela- 
tively easy. In this section we will describe some scalar operations, oper- 
ations that take one or two scalar arguments and produce a scalar result. 

First, we will discuss a class of operations that includes the familiar
operations of arithmetic we used in Section 1. Table 1 lists all of the monadic
arithmetic operations.


Symbol   Operation

  -      Negation
  |      Absolute value
  ⌊      Floor
  ⌈      Ceiling
  ×      Signum
  ÷      Reciprocal
  !      Factorial
  *      Exponential
  ⍟      Natural logarithm
  ○      π times


Table 1. The Monadic Arithmetic Operations


The first two operations, negation and absolute value, should be
familiar.


The floor ⌊X of a real number X is the largest integer less than or equal
to X.


      ⌊¯2.1                ⌊0
¯3                   0


The ceiling ⌈X of a real number X is the smallest integer greater than
or equal to X.


      ⌈¯2.1                ⌈0
¯2                   0


The signum ×X of a real number X is 1, 0, or ¯1, according to whether
X is positive, zero, or negative.


x2 x 2 


x 0 X3.95 


The reciprocal ÷X of X is 1÷X.


      ÷4                   ÷¯3
0.25                 ¯0.3333333333


If N is a positive integer, then the factorial !N of N is the product of
the integers 1, 2, ..., N. In addition, !0 is defined to be 1.


      !3                   !6
6                    720
      !4                   !0
24                   1


The symbol ! is made by overstriking the single quote ' with a period.

The last three operations in Table 1 are not required in this book but
are included for completeness. The value of *X is e raised to the power
X. The value of ⍟X is the natural logarithm of X. The symbol ⍟ is made by
overstriking ○ and *. The value of ○X is π times X.


      *1                   ⍟*3
2.718281828          3
      *2                   ○1
7.389056099          3.141592654
      ⍟10                  ○2
2.302585093          6.283185307

The dyadic arithmetic operations are listed in Table 2. 


Symbol   Operation

  +      Addition
  -      Subtraction
  ×      Multiplication
  ÷      Division
  *      Exponentiation
  ⌊      Minimum
  ⌈      Maximum
  |      Remainder
  !      Binomial coefficient
  ⍟      Logarithm
  ○      Trigonometric, etc.


Table 2. The Dyadic Arithmetic Operations


We have already encountered the first five dyadic arithmetic
operations in Section 1. If X and Y are real numbers, then X⌊Y is the smaller
of the two and X⌈Y is the larger.


      3⌊4                  3⌈4
3                    4
      3⌊¯4                 ¯3⌈¯4
¯4                   ¯3
      1.2⌊0.7              1.2⌈0.7
0.7                  1.2


The definition of the remainder operation is a little more
complicated. Let X and Y be real numbers. The X remainder of Y, denoted by
X|Y, is a number of the form Y+N×X, where N is an integer. If X is 0,
then Y+N×X is Y for any N and so 0|Y is Y. If X is not 0, we choose the
correct number of the form Y+N×X as follows. If X is positive, then X|Y is
the smallest number of the form Y+N×X that is greater than or equal to 0.
If X is negative, then X|Y is the largest number of the form Y+N×X that
is less than or equal to 0.


      3|4                  0|3
1                    3
      4|3                  0|¯3
3                    ¯3
      3|11                 1.5|5
2                    0.5
      ¯3|11                ¯1.5|5
¯1                   ¯1
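
The definition translates directly into Python, because Python's % operator already returns a result with the sign of the divisor. The sketch below is not from the original text; the function name residue is an assumption, and note that the arguments appear in the opposite order from Python's y % x.

```python
def residue(x, y):
    """The APL remainder x|y: a number of the form y + n*x for an integer n."""
    if x == 0:
        return y        # y + n*0 is y for every n
    return y % x        # in [0, x) for x > 0, and in (x, 0] for x < 0

print(residue(3, 11), residue(-3, 11))      # remainders of 11 for 3 and for -3
print(residue(1.5, 5), residue(-1.5, 5))    # non-integer left arguments also work
print(residue(0, 3))                        # a zero left argument returns y
```
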


Let M and N be nonnegative integers with M less than or equal to N.
The binomial coefficient M!N is (!N)÷(!N-M)×!M and is the coefficient
of X^{N-M}Y^M in the polynomial (X + Y)^N. It is also the number of
M-element subsets in a set with N elements. The traditional notation for M!N is


    \binom{N}{M}.


96 1 


The value of X⍟Y is the logarithm of Y to the base X.




      10⍟100                4⍟32
2                     2.5


The dyadic operation ○ takes as a left argument an integer I, which
must have absolute value at most 7. The value of I○X is given in the
following table.


I     I○X              (-I)○X
0     (1-X*2)*0.5      (1-X*2)*0.5
1     Sin X            Arcsin X
2     Cos X            Arccos X
3     Tan X            Arctan X
4     (1+X*2)*0.5      (¯1+X*2)*0.5
5     Sinh X           Arcsinh X
6     Cosh X           Arccosh X
7     Tanh X           Arctanh X


We will now consider two other classes of scalar APL operations. In
carrying out an algorithm it is often necessary to do different things,
depending on whether a particular condition is true or false. In order to be
able to handle such decisions within APL, the numbers 1 and 0 are used to
represent "true" and "false", respectively. Several primitive operations
have been included in the language to allow the formulation of conditions
whose truth or falsity can be tested. The use of numbers to represent logical
values permits arithmetic operations to be performed with them. The
usefulness of this will be seen later.

The six relational operations <, ≤, =, ≥, >, and ≠ each take two real
arguments and have the value 1 or 0, according to whether or not the
indicated relation holds. Thus X≥Y has the value 1 if X is greater than or equal
to Y and 0 otherwise. Here are some examples.


      4≤5                   7>6
1                     1

      4<3                   10≠10
0                     0

      10=11                 5<6<7
0                     0


The last example may be puzzling. The Right-to-Left Rule applies to all APL
statements and all operations. Thus the expression 6<7 is evaluated first. Its
value is 1. Then 5<1 is evaluated, giving 0.
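
The effect of the Right-to-Left Rule on this example can be imitated in other languages by writing the grouping explicitly. The Python sketch below is only an illustration; the helper `lt` is a hypothetical stand-in for the APL operation <, returning 1 or 0.

```python
def lt(a, b):
    """Model of the APL relation A<B: 1 if the relation holds, 0 otherwise."""
    return 1 if a < b else 0


# APL reads 5<6<7 as 5<(6<7): the rightmost comparison is done first.
right_to_left = lt(5, lt(6, 7))   # 6<7 is 1, then 5<1 is 0

# Grouping from the left instead gives a different answer.
left_to_right = lt(lt(5, 6), 7)   # 5<6 is 1, then 1<7 is 1

print(right_to_left, left_to_right)
```

Note that Python itself treats `5 < 6 < 7` as a chained comparison meaning "5 < 6 and 6 < 7", which is a third reading, different from both groupings above.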

There are several logical operations that take only 0 and 1 as arguments.
The most important are ~, ∧, ∨, the not, and, and or operations,
respectively. The first is monadic while the other two are dyadic.


      ~0                    1∧1
1                     1
      ~1                    1∨0
0                     1
      ~0=3                  0∨0
1                     0
      1∧0                   (5<6)∧6<7
0                     1


Sometimes ~X is referred to as the logical negation of X. 

There are two other dyadic logical operations. They are ⍲ and ⍱, the
nand and nor operations, respectively. The definition of X⍲Y is ~X∧Y,
while X⍱Y means ~X∨Y. The symbols for these operations are made by
overstriking ∧ and ∨, respectively, with ~.

There is one more monadic scalar operation that may be of use. It is
denoted by ? and is called roll. If N is a positive integer, then ?N is a
randomly chosen integer from ⍳N.


? 4 210 


210 ? 4 


Note that repeated evaluations of ?N with the same value of N can produce
different results.

All of the primitive scalar operations in APL are extended to arrays of 
arbitrary rank in a very natural way. In calculus we learn to add vectors by 
adding corresponding components. This is the way vectors are added in APL. 


      X←2 3 1 7 5 10        X+Y
      Y←1 3 4 5 8 10  3 6 5 12 13 20


In fact, every scalar operation is defined for vectors in this componentwise 
manner. 




xX-Y 2X 

10 32 30 111455 4 
xXxyY O<C<xX=Y 

29 4 35 40 100 010001 
-X O<D<Y¥<XxXx 

“2 3 1 #7 +5 10 11014141 
XLY ~C 

1315 5 10 10144110 
X\Y CvD 

1005 30 1101141 
'X CAD 

2 6 1 5040 120 3628800 o1000t1 


If f is a monadic scalar operation and g is a dyadic scalar operation, 
then for any arrays A and B the expressions fA and AgB are evaluated entry 
by entry. Here are some examples with matrices. 


      ⎕←A←2 2⍴¯1 3 2 ¯4
¯1  3
 2 ¯4
      ⎕←B←2 2⍴⍳4
1 2
3 4
      |A
1 3
2 4
      A+B
0 5
5 0
      A×B
¯1   6
 6 ¯16
      A<B
1 0
1 1
      A*B
¯1   9
 8 256




Note that A×B is not the matrix that is usually called the matrix product
of A and B. The way to describe the matrix product in APL will be dis- 
cussed in Section 7. 

In general, AgB is defined only when A and B have the same rank and 
the same shape. There is one exception, however. If one of the arrays has 
just a single entry, then that array is automatically expanded to an array of 
rank and shape agreeing with the other argument. 
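
The entry-by-entry rule, together with the expansion of a single entry, can be sketched in Python for the special case of vectors. This is an illustration under stated assumptions: the name `scalar_op` is ours, vectors are modeled as Python lists, and only a plain number (not a one-entry array) is treated as the argument to be expanded.

```python
import operator


def scalar_op(f, a, b):
    """Apply the dyadic function f entry by entry, extending a lone scalar."""
    a_is_vec = isinstance(a, list)
    b_is_vec = isinstance(b, list)
    if not a_is_vec and not b_is_vec:
        return f(a, b)
    if not a_is_vec:
        a = [a] * len(b)          # expand the scalar to the other shape
    if not b_is_vec:
        b = [b] * len(a)
    if len(a) != len(b):
        raise ValueError("arguments do not have the same shape")
    return [f(x, y) for x, y in zip(a, b)]


X = [2, 3, 1, 7, 5, 10]
Y = [1, 3, 4, 5, 8, 10]
print(scalar_op(operator.add, X, Y))                  # models X+Y
print(scalar_op(operator.sub, X, 1))                  # models X-1
print(scalar_op(lambda x, y: int(x >= y), X, 4))      # models X≥4
```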


      X                     X-1
2 3 1 7 5 10          1 2 0 6 4 9
      X+1                   2×A
3 4 2 8 6 11          ¯2  6
      X*2              4 ¯8
4 9 1 49 25 100
      X≥4
0 0 0 1 1 1


The roll operation is also extended to arrays in an entry-by-entry man- 
ner. 


      ?2 4 6 8
1 3 2 4


The name of the roll operation comes from the fact that it is possible
to simulate the rolling of dice by executing ?6 6.


26 6 26 6 


One can create random matrices of zeros and ones as follows. 


HT0<0 2D 
O<+D<+2 302 00 1 
22 2 1 1 0 
22 2 
EXERCISES 


1 Carry out at a terminal all of the dialogues presented in this section. 


2 Evaluate each of the following APL expressions on paper and then 
check your answers by entering the expressions at a terminal. 




(a) x 3.7 (h) 2.97 3.1 (0) 3F4L2 
(b) 0.05 (i) 417 (p) +22 
(c) | 3.7 (j) ul 7 (q) 8*+3 
(dq) L 3.7 (k) ~ 417 (r) 34x74 
(e) [ 3.7 (Il) 2.5/7.1 (s) 2) 416 
(f) '4 (m)2!4 (t) 5|| 7 
(g) 2.9L3.1 (n) 3L 41 2 (u) 2!!3 
3 Show that for any real number X the value of |X is the same as
X××X.


4 Explain why forming ⌊X+0.5 amounts to rounding X to the nearest
integer.


5 Evaluate the following expressions as in Exercise 2. 


(a) 3<4 (g) 1vO (m)o=4|12 

(b) 1< 1 (h) oA1 (n) (5*2)=5x5 

(c) 7=8 (i) ~1A1 (0) 120=!5 

(d) 222 (jj) ~3= 2 (p) 1=2x+2 

(e) 5>6 (k) (1>0)v6<7 (q)1>0A6<7 

(f) 34 2 (I) (343)A1>0 #8 (r) 2*1+0v 24+L3.5 


6 Let A+2 3 5 7 11, B+3 1 4, and ULO<1. Evaluate each of 
the following expressions on paper and check your answers at a 


terminal. 

(a) At+2 (d) BJ 2 4 6 (g) 1+10 

(b) Ax15 (e) 1+11 (h) [ (18)+2 

(c) B<2 (f) o1+11 (i) 0=(112)|/12 


7 Construct a vector 7 whose components are the first 100 odd num- 
bers in order. 


4. DEFINED PROCEDURES 


Not all computations can be performed easily by entering APL statements 
one at a time at the terminal. Often we will want to do the same basic calcu- 
lation many times with different values of one or more of the variables. In 
this case it is more convenient to give the system a complete list of all the 
statements to be executed and refer to the entire collection by a single 
short name. This is known as defining a procedure. (The word normally
used in books on APL is "function". However, some APL procedures are
not functions as we define the term in Chapter 1.)

We begin with a very elementary example. Suppose we have the fol- 
lowing right triangle. 




[Figure: a right triangle with legs A and B and hypotenuse C.]


If A and B are known, then by the Pythagorean Theorem the value of C 
may be computed by 


      C←((A*2)+B*2)*0.5


Assuming that we have to calculate the lengths of the hypotenuses of many
right triangles, it is convenient to define a procedure called HYPOT as
follows.


      ∇C←A HYPOT B              X←12 HYPOT 5
[1]   C←((A*2)+B*2)*0.5         X
[2]   ∇                   13

      3 HYPOT 4                 12 HYPOT 3 HYPOT 4
5                         13


We have been in what is called the execution mode in which statements
are executed as soon as they are entered. The symbol ∇ (read "del")
indicates that we wish to enter the definition mode in which statements are
saved for execution at some later time. The remainder of the line
following the ∇ is called the header of the procedure. It gives the following
information.


1. The name of the procedure. (The rules for naming procedures are 
the same as those for naming variables.) 


2. The names of the variables used within the procedure to repre- 
sent the arguments, if any. 


3. The name of the variable used within the procedure to represent 
the value, if any, returned by the procedure. 


In our example the name of the procedure is HYPOT and it requires
two arguments, A and B. Following the convention described earlier for 
primitive APL operations, the name of the procedure is placed between its 
two arguments. The procedure is to return a value that is referred to within 
the procedure by the name C. 

After the header has been entered, the system prompts by typing
[1], indicating that the first line of the procedure is to be entered. After
that line has been typed, the system prompts with [2] for the second line.
In our case there is no second line, so we type another ∇ to tell the system
that the definition is complete and we wish to return to the execution mode.
The procedure is now available for use. Note that the Right-to-Left Rule
applies to defined procedures as well as to primitive operations.
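
For readers who know a conventional language, a header with two arguments and a returned value corresponds roughly to a function with two parameters and a result. The following Python sketch is an analogy only; the name `hypot_apl` is hypothetical, and Python places the function name before its arguments rather than between them.

```python
def hypot_apl(a, b):
    """Model of the defined procedure HYPOT: hypotenuse of a right triangle."""
    c = ((a ** 2) + b ** 2) ** 0.5   # mirrors line [1] of the procedure
    return c                         # C is the value named in the header


print(hypot_apl(3, 4))
# Nested use mirrors 12 HYPOT 3 HYPOT 4, where the right-hand call runs first.
print(hypot_apl(12, hypot_apl(3, 4)))
```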

A defined procedure may have zero, one, or two arguments and may 
or may not return an explicit value. Thus there are six basic forms for the 
header of a procedure. 


      ∇PROC                 ∇Z←PROC
      ∇PROC Y               ∇Z←PROC Y
      ∇X PROC Y             ∇Z←X PROC Y


Here the name of the procedure is PROC, the variables X and Y represent 
arguments, and the variable Z represents a value returned by the procedure. 

The following dialogue illustrates how the variables used in the header 
have meaning only within the procedure. 


A+10+C<« 7 A 
B 3 
VALUE ERROR C 
B 7 
A B 
D<«8 HYPOT 15 VALUE ERROR 
D B 
17 A 


Here we define two variables A and C and check that B has no value
assigned to it. (When we enter the name B, the computer responds with the
message VALUE ERROR, retypes the line, and types a caret below B.) Next
we use HYPOT to compute D. Within HYPOT the values of the arguments
8 and 15 are assigned to variables A and B and the value 17 is assigned to
C. However, after the calculation of D, we find that the values of A and
C are unchanged from their values prior to the use of HYPOT and that no
value is assigned to B. The variables A, B, and C in the header of HYPOT
are treated as separate variables, unrelated to other variables with the same
names that may be defined.

Variables that have meaning only within a procedure are called local 
variables. They may have the same names as variables already existing out- 
side the procedure. The arguments of a procedure and the value returned by 
the procedure are local variables. Additional local variables may be used 
within a procedure. They are listed in the header preceded by a semicolon 
and separated by semicolons. Procedures may also use global variables,
variables that may have been assigned values before execution of the procedure
begins and that continue to exist after completion of the procedure. A
procedure may even create new global variables.

The following examples illustrate the use of local and global variables. 
Suppose we need to evaluate the expression 


$(x^3 + a^2)^3 + (x^3 + a^2)^2 + 3$


for various values of a and x. We start by defining a procedure F to do the
evaluation. The expression $x^3 + a^2$ occurs twice so it is reasonable to
compute it first.


      ∇Y←A F X;B                1 F 3
[1]   B←(X*3)+A*2         22739
[2]   Y←3+(B+1)×B*2             10 F 4
[3]   ∇                   4437843

      1 F 2                     10 F 5
813                       11441253


The variable B used in F is a local variable. 

Suppose, however, that we knew the value of a will remain the same for 
a large number of values of x. We might then proceed differently, defining 
the procedure G as follows. 


      ∇Y←G X;B                  A←10
[1]   B←(X*3)+A*2               G 4
[2]   Y←3+(B+1)×B*2       4437843
[3]   ∇                         G 5
      A←1                 11441253
      G 2                       G 100
813                       1.000301030E18
      G 3
22739


The variable A in G is now a global variable whose value must be assigned 
before G is used. 
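
The difference between F and G can be imitated in Python, where a name assigned inside a function is local and a name that is only read is looked up globally. This is a sketch under that analogy; the lowercase names `f`, `g`, `g_2`, and `g_4` are ours.

```python
def f(a, x):
    """Model of F: both A and X are arguments, B is a local variable."""
    b = x ** 3 + a ** 2
    return 3 + (b + 1) * b ** 2


A = 1  # global variable that g reads; it must be assigned before g is used


def g(x):
    """Model of G: A is global, B is still local."""
    b = x ** 3 + A ** 2
    return 3 + (b + 1) * b ** 2


g_2 = g(2)     # uses A = 1
A = 10         # changing the global changes later results of g
g_4 = g(4)     # uses A = 10

print(f(1, 2), f(10, 4), g_2, g_4)
```

Passing a as an argument, as in F, keeps each call self-contained; relying on a global, as in G, saves typing when a seldom changes but makes the result depend on state set elsewhere.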

Once a procedure has been defined on a terminal system, the procedure 
may be listed or modified. Information concerning the editing of defined 
procedures may be found in Section A2.1 and in APL reference manuals. 

A defined procedure may use or invoke other procedures, as illustrated 
in the following somewhat artificial example. 




      ∇Z←X SUM Y                ∇C←A MAX B
[1]   Z←X+Y               [1]   C←(A SUM B)⌈A PROD B
[2]   ∇                   [2]   ∇
      ∇Z←X PROD Y               1 MAX 2
[1]   Z←X×Y               3
[2]   ∇                         3 MAX 4
                          12


In fact, as we will soon see, a procedure may even invoke itself! 

The lines of a procedure are normally executed in the order of their line
numbers. Sometimes it is necessary to change this order by branching. A
branch is indicated by a right-pointing arrow: →. The two basic kinds of
branches are illustrated in the following procedure SUMOFSQUARES,
which computes the sum of the squares of the first N positive integers.


      ∇S←SUMOFSQUARES N;I       SUMOFSQUARES 3
[1]   S←0                 14
[2]   I←1                       SUMOFSQUARES 10
[3]   →(N<I)/0            385
[4]   S←S+I*2                   SUMOFSQUARES 0
[5]   I←I+1               0
[6]   →3
[7]   ∇


This procedure is intended only as an example of the use of branching 
and not as a model for constructing other APL procedures. In spirit, it is 
really a FORTRAN program. As we will learn in Section 6, the whole pro- 
cedure SUMOFSQUARES can be replaced by a single short APL statement. 
Also, it is well known that the expression (N×(N+1)×1+2×N)÷6 gives
the value of the sum of the first N squares.

The procedure SUMOFSQUARES contains two branches. Line 6 indicates
an unconditional branch to line 3, while line 3 contains a conditional
branch. Although it is possible to indicate conditional branches in many
ways in APL, perhaps the most common is the following.


      →(condition)/number


Here "condition" represents an expression with a logical value, 1 for true
or 0 for false, and "number" is a line number. The statement specifies that
a branch is to be made only if the condition is true. Thus line 3 of
SUMOFSQUARES may be read "if N is less than I, then branch to line 0".
Of course, there is no line 0. A branch to a nonexistent line terminates
execution of the procedure.

The variable I runs through the integers 1, 2, ..., N. In line 4
we add I*2 to S. Since S is initially given the value 0, the value of S after
a given execution of line 4 is the sum of the first I squares. Execution
continues until, at line 5, the value of I becomes N+1. After the branch
back to line 3, execution is stopped by the conditional branch.
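
To make the flow of control concrete, the following Python sketch simulates the six lines of SUMOFSQUARES with an explicit "next line" variable. It is an illustration of branching only, not a recommended way to write Python; the names `sum_of_squares` and `line` are ours.

```python
def sum_of_squares(n):
    """Step through the lines of SUMOFSQUARES; line 0 means 'leave'."""
    s = 0
    i = 0
    line = 1
    while line != 0:
        if line == 1:
            s = 0
            line = 2
        elif line == 2:
            i = 1
            line = 3
        elif line == 3:
            # Conditional branch: go to (nonexistent) line 0 when N < I.
            line = 0 if n < i else 4
        elif line == 4:
            s = s + i ** 2
            line = 5
        elif line == 5:
            i = i + 1
            line = 6
        elif line == 6:
            line = 3          # unconditional branch back to the test
    return s


for n in (3, 10, 0):
    # Compare with the closed form N(N+1)(2N+1)/6.
    print(n, sum_of_squares(n), n * (n + 1) * (2 * n + 1) // 6)
```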

The procedure SUMOFSQUARES illustrates the three aspects of a loop.
The loop variable I is initialized in line 2, incremented in line 5, and tested
in line 3. It is common practice in APL programming to combine the
incrementing and the testing steps.


      ∇S←SUMSQ N;I              SUMSQ 3
[1]   S←I←0               14
[2]   →(N<I←I+1)/0              SUMSQ 10
[3]   S←S+I*2             385
[4]   →2                        SUMSQ 0
[5]   ∇                   0


Because I is incremented before the test is made even on the first pass
through the loop, it is necessary to initialize I with one less than the first
value actually desired, that is, with 0.

There is another approach possible to the problem of computing the
sum of the first N squares.


      ∇S←RSUMSQ N               RSUMSQ 3
[1]   →(N>0)/3            14
[2]   →S←0                      RSUMSQ 10
[3]   S←(RSUMSQ N-1)+N*2  385
[4]   ∇                         RSUMSQ 0
                          0


The idea behind RSUMSQ is simple. In order to compute the sum of the first
N squares, it is enough to add N*2 to the sum of the first N-1 squares. The
sum of the first N-1 squares is computed by adding (N-1)*2 to the sum
of the first N-2 squares, and so on. Obviously, this method is not going to
work unless there is some value of N for which we already know the sum of
the first N squares. In RSUMSQ the sum of the first 0 squares is taken to be
0. (This is consistent with the standard mathematical convention that
defines a sum with no summands to be 0.) For any positive integer N the value
of RSUMSQ N is the sum of N*2 and RSUMSQ N-1.
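
The same recursive idea carries over directly to other languages. Here is a minimal Python sketch (the name `rsumsq` is ours) with the base case 0 and the recursive step described above.

```python
def rsumsq(n):
    """Sum of the first n squares, computed recursively."""
    if n > 0:
        # Sum of the first n squares = sum of the first n-1 squares + n*n.
        return rsumsq(n - 1) + n ** 2
    return 0   # a sum with no summands is 0


print(rsumsq(3), rsumsq(10), rsumsq(0))
```

Each nested call uses additional memory until it returns, so very large arguments can exhaust the space available for pending calls; in Python this shows up as a `RecursionError` once the interpreter's recursion limit is exceeded.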

The use by a procedure of itself is called recursion. There is a close 




connection between recursive procedures and proofs by induction. An induc- 
tion proof that some mathematical object exists can often be converted into 
a recursive procedure for constructing the object. 

There are some practical difficulties in using recursive APL procedures. 
Each time one procedure uses another procedure, additional memory space 
in the computer is required. Since memory is limited, this means that pro- 
cedures cannot continue invoking other procedures indefinitely. 

It is possible to write APL procedures that will never terminate by 
themselves. Perhaps the simplest is the following. 


      ∇INFINITELOOP
[1]   →1
[2]   ∇


Here the single line of the procedure consists of a branch to itself that
continues indefinitely until there is some outside intervention. Procedures of
this type are said to contain an infinite loop. Infinite loops often occur
unintentionally. If a procedure seems to be taking longer than expected to
finish, one should signal attention by hitting the attention or break key on
the terminal. This causes the procedure to be suspended between two lines.
One may now look at the values of the variables in the procedure to see if
everything is going as intended. Computation may be resumed if desired by
typing → followed by the number of the next line to be executed.

Additional information concerning suspended procedures can be found
in Sections A2.3 and A2.5 and in APL reference manuals. A word of caution
is in order, however. Beginning students of APL often forget to terminate
suspended procedures when they are no longer needed. This can cause
problems, especially when local variables having the same names as global
variables are involved. It is a good rule to terminate suspended procedures as
soon as possible by typing → followed by a carriage return. Within a
procedure, a right arrow by itself aborts the entire calculation currently under
way.

There is one more point concerning branching that needs to be men- 
tioned. The editing of a procedure may cause line numbers to change. If 
the line number for a particular statement changes, then any branches to 
that statement will also have to be changed. Since procedures often go 
through many versions before they reach their final form, a great deal of 
work can be saved by the use of statement labels. The following version 
of SUMSQ uses a label LOOP on line 2.




      ∇S←SUMSQX N;I
[1]   S←I←0
[2]   LOOP:→(N<I←I+1)/0
[3]   S←S+I*2
[4]   →LOOP
[5]   ∇


Statement labels are local variables that are automatically given a value equal 
to the number of the line on which they occur. A label is separated from the 
remainder of the statement by a colon. It is good programming practice to 
use statement labels whenever possible. 

Loops need not involve loop variables that count passes through the 
loop. 


      ∇Y←SQRT X
[1]   →(0>X)/0
[2]   →(0=Y←X)/0
[3]   BACK:→(X≠Y×Y←(Y+X÷Y)÷2)/BACK
[4]   ∇
      SQRT 43.56
6.6
      SQRT 0
0
      I←SQRT ¯1
VALUE ERROR
      I←SQRT ¯1
        ∧


The procedure SQRT computes the square root Y of its argument X. For
nonnegative values of X we take X to be a first approximation to Y. If X is
0, we stop immediately. Otherwise, given any value for Y, we replace Y by
the average of Y and X÷Y and continue until X and Y×Y agree to the
accuracy specified by the comparison tolerance. (See Section A2.2.) If X is
negative, then the procedure is terminated in line 1 before any value is
assigned to Y. This accounts for the error message printed in the previous
example.
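
The averaging loop can be sketched in Python as follows. Two details are assumptions of this sketch and not taken from the text: the relative tolerance `tol` stands in for the APL comparison tolerance, and a negative argument returns `None` to mimic the absence of a result. The name `sqrt_by_averaging` is ours.

```python
def sqrt_by_averaging(x, tol=1e-13):
    """Square root by repeatedly averaging Y and X/Y (illustrative sketch)."""
    if x < 0:
        return None            # no value is ever assigned to the result
    y = x                      # X itself is the first approximation
    if y == 0:
        return y
    while True:
        y = (y + x / y) / 2    # replace Y by the average of Y and X/Y
        # Stop when X and Y*Y agree to within the assumed tolerance.
        if abs(x - y * y) <= tol * max(abs(x), abs(y * y)):
            return y


print(sqrt_by_averaging(43.56))
print(sqrt_by_averaging(0))
print(sqrt_by_averaging(-1))
```

Like the loop in SQRT, this loop has no counting variable; it runs until a condition on the computed values is met.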


EXERCISES 
1 Enter the definitions of each of the procedures presented in this 
section and try them out. 


2 Define and use a procedure called AVERAGE that returns the av- 
erage of its two arguments. 




3 Define and use a procedure POWERS with one argument X and no 
explicit result that creates three global variables X2, X3, and 
X4, which are the square, cube, and fourth power of X, respec- 
tively. 

4 Define and use a procedure PARITY with a single argument N,
assumed to be an integer, that returns the value 1 if N is odd and
the value 0 if N is even.


5 Write a procedure POWERSUM modeled after SUMSQ that has two
arguments M and N and that returns the sum of the Mth powers of
1, 2, ..., N.

6 Write a recursive version RPOWERSUM of POWERSUM based on
RSUMSQ.


7 Write a recursive procedure FACT for computing factorials that
does not involve the use of the primitive operation !. Use the
identity (!N) = N×!N-1 and the starting value of 1 for !0.


8 The Fibonacci sequence is the sequence 1, 2, 3, 5, 8, 13, ..., in
which each term after the second is the sum of the two terms
preceding it. Construct a recursive procedure FIB such that FIB I
is the value of the Ith term in the Fibonacci sequence.


9 Write a procedure FIBSUM such that FIBSUM N is the sum of the
first N terms of the Fibonacci sequence. Do not use recursion or the
procedure FIB of Exercise 8.


10 Any positive integer n can be written uniquely in the form $2^r m$,
where r and m are integers and m is odd. Call m the odd part of
n. Write a procedure ODDPART such that M←ODDPART N makes
M the odd part of N. Use a loop similar to the one in SQRT. This
can be done by taking M to be N initially and then replacing M by
M÷2 as long as M is even.


5. PRIMITIVE MIXED OPERATIONS 


In this section we consider a class of primitive APL operations that are not 
applied entry by entry to their arguments. The operations in this class are 
usually called mixed. 

Table 1 lists the monadic mixed operations. Note that there are three 
different ways of denoting the operation reverse. 




Symbol        Name
⍴             Shape
⍳             Index generator
⌽ ⊖ ⌽[I]      Reverse
⍋             Grade up
⍒             Grade down
⍉             Transpose
,             Ravel
⌹             Matrix inversion
⍎             Execute
⍕             Standard format


Table 1. The Monadic Mixed Operations 


We have already seen the shape and index generator operations in Section 2.
The operations ⌹, ⍎, and ⍕ are described in Section 8. We will discuss the
remaining five operations in Table 1 one at a time. The symbols ⌽, ⊖, ⍉,
⍋, ⍒, ⍎, ⍕, and ⌹ are entered at a terminal by overstriking ○, ○, ○, ∆, ∇, ⊥,
⊤, and ⎕ by |, -, \, |, |, ∘, ∘, and ÷, respectively.

If V is a vector, then ⌽V is the vector obtained from V by reversing the
order of the components.


      ⌽V←5 3 2 1            ⌽⍳4
1 2 3 5               4 3 2 1


With matrices, the symbol ⌽ indicates the reversal of the rows, while ⊖
indicates reversal of the columns.


      ⎕←A←2 3⍴⍳6            ⊖A
1 2 3                 4 5 6
4 5 6                 1 2 3
      ⌽A
3 2 1
6 5 4


If C is an array of rank greater than 2, we can specify reversal along any of
the "axes" of C by using the indexed form ⌽[I].




      ⎕←C←2 2 2⍴⍳8          ⌽[2]C
1 2                   3 4
3 4                   1 2

5 6                   7 8
7 8                   5 6
      ⌽[1]C                 ⌽[3]C
5 6                   2 1
7 8                   4 3

1 2                   6 5
3 4                   8 7


The symbol ⌽ always denotes reversal along the last axis, while ⊖ always
denotes reversal along the first axis. The indexing in ⌽[I] is subject to the
index origin, just as in the case of array indexing.


      ⌽[1]A                 ⌽[1]A
4 5 6                 3 2 1
1 2 3                 6 5 4
      ⎕IO←0                 ⎕IO←1


The grade up operation ⍋ takes a vector V as an argument and produces
a vector P that is a permutation of ⍳⍴V such that the components of V[P]
are in increasing order.
      V                     ⍋W←2 1 7 4
5 3 2 1               2 1 4 3
      ⍋V                    W[⍋W]
4 3 2 1               1 2 4 7
      V[⍋V]
1 2 3 5


If the components of V are not distinct, then P←⍋V is chosen so that
whenever I < J and V[P[I]] = V[P[J]], then P[I] < P[J].


      ⍋1 2 3 1 2 3
1 4 2 5 3 6




This determines P uniquely. The result of ⍋ is origin dependent.
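
Grade up amounts to a stable sort of the index positions. The Python sketch below is an illustration; the name `grade_up` and the `origin` parameter (modeling the index origin) are ours. Python's `sorted` is stable, which gives exactly the tie-breaking rule stated above.

```python
def grade_up(v, origin=1):
    """Indices that arrange v in increasing order, ties kept in index order."""
    order = sorted(range(len(v)), key=lambda i: v[i])   # stable sort
    return [i + origin for i in order]


V = [5, 3, 2, 1]
P = grade_up(V)
print(P)                              # permutation of 1..4
print([V[p - 1] for p in P])          # models V[P], in increasing order
print(grade_up([1, 2, 3, 1, 2, 3]))   # equal entries keep their order
print(grade_up(V, origin=0))          # the result depends on the origin
```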


      ⍋V                    ⍋V
4 3 2 1               3 2 1 0
      ⎕IO←0                 ⎕IO←1


The grade down operation ⍒ is defined very much like ⍋. The
components of V[⍒V] are in decreasing order, however.


      ⍒W                    W[⍒W]
3 4 1 2               7 4 2 1


The transpose operation ⍉, when applied to a matrix, gives the usual
matrix transpose.


      A                     ⍉A
1 2 3                 1 4
4 5 6                 2 5
                      3 6


If X is any array, then ⍉X is the array obtained by reversing the order of the
indices. If X has rank 4 and Y←⍉X, then Y[I;J;K;L] is X[L;K;J;I]. If
X is a scalar or a vector, then ⍉X is the same as X.


      ⍴D←2 3 4⍴⍳24          ⍴⍉D
2 3 4                 4 3 2


The ravel ,X of an array X is the vector that lists the entries of X in the 
order they would be printed at a terminal. 


      A                     ,1
1 2 3                 1
4 5 6                       ⍴,1
      ,A              1
1 2 3 4 5 6                 ,2 2 2⍴⍳8
      V               1 2 3 4 5 6 7 8
5 3 2 1
      ,V
5 3 2 1




The dyadic mixed operations are given in Table 2. 


Symbol        Name
⍴             Reshape
⍳             Index of
∊             Membership
↓             Drop
↑             Take
/ ⌿ /[I]      Compress
\ ⍀ \[I]      Expand
⌽ ⊖ ⌽[I]      Rotate
, ,[I]        Catenate, laminate
⍉             Generalized transpose
?             Deal
⊥             Decode
⊤             Encode
⌹             Matrix division
⍕             Format


Table 2. The Dyadic Mixed Operations 


The last four operations—decode, encode, matrix division, and format—will 
be described in Section 8. The other operations are discussed in the re- 
mainder of this section. 

The reshape operation was introduced in Section 2. In general, V⍴W is
defined whenever V and W are vectors or scalars and the entries of V are
nonnegative integers. If none of the components of V is 0, then W must
be nonempty. The result is an array whose shape is ,V and whose entries
are taken from W.


      ⎕←B←3 4⍴⍳12           5⍴1
1  2  3  4            1 1 1 1 1
5  6  7  8                  ⍴2 2 2 2⍴0
9 10 11 12            2 2 2 2
      2 3⍴⍳4                ⍴3 0 2 0⍴⍳0
1 2 3                 3 0 2 0
4 1 2


If W has too many entries, then only enough entries are taken from the 
beginning of W to form the desired array. If W has too few entries, then 
the entries of W are repeated. 

In the expression V⍳X the array V must be a vector, but X may have
any rank. The result is an array Y with the same shape as X. Each entry in
Y is the index in V of the first occurrence of the corresponding entry in
X. If some entry of X does not occur as a component of V at all, then one
more than the largest valid index for V is used for the entry in Y. The
result of V⍳X is origin dependent.
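
The index-of rule can be modeled in a few lines of Python. In this sketch (names `index_of` and `origin` are ours) a vector is a list and a matrix is a list of rows, so the result automatically has the same shape as the right argument.

```python
def index_of(v, x, origin=1):
    """Model of V⍳X: position in v of the first occurrence of each entry."""
    if isinstance(x, list):
        # Recurse so that the result has the same shape as x.
        return [index_of(v, entry, origin) for entry in x]
    for k, item in enumerate(v):
        if item == x:
            return k + origin
    # Not found: one more than the largest valid index for v.
    return len(v) + origin


V = [5, 3, 2, 1]
W = [2, 1, 7, 4]
A = [[1, 2, 3], [4, 5, 6]]
print(index_of(V, 3))
print(index_of(V, W))
print(index_of(W, V))
print(index_of(V, A))
```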


      V                     W⍳V
5 3 2 1               5 5 1 2
      V⍳3                   A
2                     1 2 3
      W               4 5 6
2 1 7 4                     V⍳A
      V⍳W             4 3 2
3 4 5 5               5 1 5


The APL membership operation ∊ is closely related to the usual notion
of set membership. If A and B are arrays, then C←A∊B defines C to be
the logical array with the same shape as A such that an entry of C is 1 if and
only if the corresponding entry of A occurs among the entries of B.


      A                     A∊W
1 2 3                 1 1 0
4 5 6                 1 0 0
      W                     W∊A
2 1 7 4               1 1 0 1


If V is a vector and N is a scalar or a vector of length 1, then N↓V is
the vector obtained by dropping or omitting the first N components of V
if N≥0 or the last |N components of V if N<0.


      V                     ¯2↓V
5 3 2 1               5 3
      2↓V                   ⍴5↓V
2 1                   0


Rows and columns may be dropped from a matrix. 


      B                     ¯1 ¯1↓B
1  2  3  4            1 2 3
5  6  7  8            5 6 7
9 10 11 12                  ¯1 2↓B
      1 1↓B           3 4
 6  7  8              7 8
10 11 12




The take operation ↑ is defined in much the same way as ↓, but now the
first argument indicates the part of the second argument to be kept or taken.


      V                     B
5 3 2 1               1  2  3  4
      3↑V             5  6  7  8
5 3 2                 9 10 11 12
      ¯3↑V                  2 2↑B
3 2 1                 1 2
                      5 6


If we try to take more than is present, then zeros are added at the end or the 
beginning, depending on the sign of the first argument. 


      V                     ¯6↑V
5 3 2 1               0 0 5 3 2 1
      7↑V
5 3 2 1 0 0 0


In the case of character arrays, spaces are used instead of zeros. 
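
For vectors, drop and take can be sketched with Python list slicing. The names `drop` and `take` and the `fill` parameter are ours; `fill` defaults to 0 and could be set to a space to imitate the behavior described for character arrays.

```python
def drop(n, v):
    """Model of N↓V: omit the first n entries (n >= 0) or the last |n|."""
    return v[n:] if n >= 0 else v[:n]


def take(n, v, fill=0):
    """Model of N↑V: keep the first n entries (n >= 0) or the last |n|,
    padding with fill when more is asked for than is present."""
    if n >= 0:
        return v[:n] + [fill] * max(0, n - len(v))
    k = -n
    return [fill] * max(0, k - len(v)) + v[max(0, len(v) - k):]


V = [5, 3, 2, 1]
print(drop(2, V), drop(-2, V), len(drop(5, V)))
print(take(3, V), take(-3, V))
print(take(7, V), take(-6, V))
```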

Let V be a vector and let X be a logical vector of length ⍴V. The
compression X/V of V by X is the vector consisting of the components of V
corresponding to the 1's in X.


      V                     (~X)/V
5 3 2 1               2
      X←1 1 0 1             (V>2)/V
      X/V             5 3
5 3 1


An array may be compressed along any axis. The symbols /, ⌿, and /[I] are
used to denote compression along the last, first, and Ith axes, respectively.


      B                     1 0 1/[1]B
1  2  3  4            1  2  3  4
5  6  7  8            9 10 11 12
9 10 11 12                  ⎕IO←0
      X                     X/[1]B
1 1 0 1               1  2  4
      X/B             5  6  8
1  2  4               9 10 12
5  6  8                     ⎕IO←1
9 10 12
      1 0 1⌿B
1  2  3  4
9 10 11 12



As we saw with the reverse operation, the indexing in /[I] is subject
to the index origin.

The operation of expansion is almost inverse to that of compression.
Let V be a vector and let X be a logical vector with exactly ⍴V entries equal
to 1. The expansion X\V of V by X is the vector of length ⍴X with the
components of V in the positions corresponding to ones in X and 0's elsewhere.
(If V is a character vector, then the 0's in X correspond to spaces in X\V.)
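
Compression and expansion of vectors can be modeled in Python as follows. The function names and the `fill` parameter are ours, and the length checks mirror the conditions stated in the text.

```python
def compress(x, v):
    """Model of X/V: keep the entries of v where the logical vector x is 1."""
    if len(x) != len(v):
        raise ValueError("x and v must have the same length")
    return [item for flag, item in zip(x, v) if flag]


def expand(x, v, fill=0):
    """Model of X\\V: place the entries of v where x is 1, fill elsewhere."""
    if sum(x) != len(v):
        raise ValueError("x must contain exactly len(v) ones")
    items = iter(v)
    return [next(items) if flag else fill for flag in x]


V = [5, 3, 2, 1]
print(compress([1, 1, 0, 1], V))
X = [1, 0, 1, 1, 0, 1]
print(expand(X, V))
print(compress(X, expand(X, V)))   # compressing the expansion restores V
```

The last line shows one sense in which the two operations are almost inverse: expanding and then compressing by the same logical vector gives back the original vector.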


      V                     X\V
5 3 2 1               5 0 3 2 0 1
      X←1 0 1 1 0 1


An array may be expanded along any one of its axes, with \, ⍀, and \[I]
being used completely analogously to /, ⌿, and /[I].


      B                     1 0 1 1⍀B
1  2  3  4            1  2  3  4
5  6  7  8            0  0  0  0
9 10 11 12            5  6  7  8
      X\B             9 10 11 12
1 0  2  3 0  4              X\[2]B
5 0  6  7 0  8        1 0  2  3 0  4
9 0 10 11 0 12        5 0  6  7 0  8
                      9 0 10 11 0 12


If V is a vector and N is a positive integer, then N⌽V is the vector
obtained by rotating V to the left N places. Any components shifted off the
left end reappear at the right. If N<0, then N⌽V is the result of rotating
V to the right |N places.
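
A rotation of a vector is easy to model in Python by cutting the list at the right place and swapping the two pieces. The name `rotate` is ours; taking the count modulo the length handles negative counts and counts larger than the length in one step.

```python
def rotate(n, v):
    """Model of N⌽V: rotate v left by n places (right if n is negative)."""
    if not v:
        return v
    k = n % len(v)        # e.g. -1 on a vector of length 4 becomes 3
    return v[k:] + v[:k]


V = [5, 3, 2, 1]
print(rotate(2, V))
print(rotate(-1, V))
print(rotate(7, V))
```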


      V                     ¯1⌽V
5 3 2 1               1 5 3 2
      2⌽V                   7⌽V
2 1 5 3               1 5 3 2


Arrays may be rotated along any axis, with ⌽, ⊖, and ⌽[I] being used to
indicate which axis.


      B                     ¯1⊖B
 1  2  3  4            9 10 11 12
 5  6  7  8            1  2  3  4
 9 10 11 12            5  6  7  8
      1⌽B                   2⌽[2]B
 2  3  4  1            3  4  1  2
 6  7  8  5            7  8  5  6
10 11 12  9           11 12  9 10




It is possible to rotate the rows or columns of a matrix by different amounts. 


      B                     0 2 ¯1 2⊖B
 1  2  3  4            1 10 11 12
 5  6  7  8            5  2  3  4
 9 10 11 12            9  6  7  8
      0 1 2⌽B
 1  2  3  4
 6  7  8  5
11 12  9 10


The components of the first argument give the amount the rows or columns 
are to be rotated. 

If V and W are vectors, then V,W is the vector obtained by following
the components of V with the components of W and is called the catenation
of V and W.


      V                     V,W
5 3 2 1               5 3 2 1 2 1 7 4
      W                     W,V
2 1 7 4               2 1 7 4 5 3 2 1


Two arrays A and B of the same positive rank may be catenated along any
axis, provided their dimensions along the other axes agree. The expression
A,B denotes catenation along the last axis. If some other axis is desired,
then the form A,[I]B must be used.


      A                     A,[1]A
1 2 3                 1 2 3
4 5 6                 4 5 6
      A,A             1 2 3
1 2 3 1 2 3           4 5 6
4 5 6 4 5 6


The catenation A,B is also defined if the ranks of A and B differ by 1 or 
if one array has positive rank and the other is a scalar. Here are a few ex- 
amples. 


      A                     A,[1]⍳3
1 2 3                 1 2 3
4 5 6                 4 5 6
      A,⍳2            1 2 3
1 2 3 1                     7,A
4 5 6 2               7 1 2 3
      (⍳2),A          7 4 5 6
1 1 2 3
2 4 5 6



Two arrays A and B of the same shape may be laminated to form an
array of rank 1+⍴⍴A in which the dimension along the new axis is 2. If
A and B are scalars, the result is written A,B and is a vector of length 2. If A
and B have positive rank, then one must write A,[I]B, where I is not an
integer. This indicates that the new axis is to be created after the (⌊I)th
axis and before the (⌈I)th axis.


      1,2                   V,[1.5]W
1 2                   5 2
      V               3 1
5 3 2 1               2 7
      W               1 4
2 1 7 4                     ⎕IO←0
      V,[0.5]W              V,[¯0.5]W
5 3 2 1               5 3 2 1
2 1 7 4               2 1 7 4
                            ⎕IO←1


The most common use of the generalized transpose operation is in
obtaining the main diagonal of a matrix. If B is a matrix, the main diagonal
of B is the vector whose Ith component is B[I;I].


      B                     ⎕IO←0
1  2  3  4                  0 0⍉B
5  6  7  8            1 6 11
9 10 11 12                  ⎕IO←1
      1 1⍉B
1 6 11


In origin 1 the main diagonal of B is 1 1⍉B and in origin 0 it is 0 0⍉B.
If X is any array and V is a suitable vector of length ⍴⍴X, then Y←V⍉X is
the array in which the Ith index of X is taken to be the V[I]th index of
Y. The required condition on V is that the set of components of V be
precisely the same as the set of components of ⍳J for some integer J.


      D                     ⎕←F←1 2 1⍉D
 1  2  3  4            1  5  9
 5  6  7  8           14 18 22
 9 10 11 12                 F[2;3]
                      22
13 14 15 16                 D[2;3;2]
17 18 19 20           22
21 22 23 24


The expression 1 2 1⍉D denotes the array obtained from D by setting
the first and third indices equal, that is, F[I;J] is D[I;J;I]. In origin
0 we would have to write 0 1 0⍉D. Here are some additional examples.
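
The index relation just stated can be checked with nested lists in Python. This sketch is an illustration only: `D` is built to match the 2 by 3 by 4 array of the integers 1 through 24 used above, `B` matches the 3 by 4 matrix, and Python indices start at 0, so they correspond to origin 0.

```python
# D[i][j][k] holds the integers 1..24 in the order they would be printed.
D = [[[12 * i + 4 * j + k + 1 for k in range(4)]
      for j in range(3)]
     for i in range(2)]

# Model of setting the first and third indices equal: F[i][j] = D[i][j][i].
F = [[D[i][j][i] for j in range(3)] for i in range(2)]
print(F)

# Model of the main diagonal of a matrix: the vector of entries B[i][i].
B = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
diagonal = [B[i][i] for i in range(min(len(B), len(B[0])))]
print(diagonal)
```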


      1 1 1⍉D               2 1 2⍉D
1 18                   1 14
      1 1 2⍉D          5 18
 1  2  3  4            9 22
17 18 19 20


If M and N are nonnegative integers with M≤N, then M?N is a vector of
length M whose components are chosen randomly from ⍳N without
repetitions. In particular, N?N is a random permutation of ⍳N.


3710 10710 

9 3°95 43145 9 81067 2 
3710 

59 8 9 


This dyadic operation ? is called deal.


EXERCISES 


In the following exercises assume that the index origin is 1. Exercises 1 and 
2 refer to the following arrays. 


x R 
1031101 1 0 0 0 

Y B 
11031003111 1 3 5 

Z 2 4 6 


4212 3 4 


1 Evaluate each of the following APL expressions on paper and check 
your answers by entering the expressions at a terminal. 


(a) 02 (d) AZ (g) QF 
(b) oF (e) YZ (h) &Z 
(c) OL2]2 (f) ZLAZ] (i) .£ 
2 Evaluate the following expressions as in Exercise 1. 
(a) 2 4pZ (g) 3 30R (m)1 26£F 
(b) 2201 (h) 342 (n) Z,0 
(c) Zz (i) 7+Z (0) E,C1]E 
(d) FeZ Gj) Y\Z (p)Z,[0.5]X 
(e) X/Z (k) 162 (q) Z,01.5]X 


(f) 362 (I) X,Y (r) 1 182 




3 Suppose G is an array of rank 4. Write an expression defining
H to be an array of rank 3 such that H[I;J;K] is G[I;K;J;I].


4 For what vectors V are ⍋V and ⌽⍒V the same?


5 Let V be a vector and let X and Y be logical vectors such that 
X\V and Y/V are defined. Describe the vectors X/X\V and Y\Y/V. 


6. REDUCTION AND SCAN 


In traditional mathematical notation the sum of the components of a vector
v = (v_1, ..., v_n) is written

$$\sum_{i=1}^{n} v_i$$

and the product of the components is written

$$\prod_{i=1}^{n} v_i.$$

In APL, the sum and product of the components of the vector V are denoted
+/V and ×/V, respectively.


      V←2 3 5 7               ×/V
      +/V               210
17


More generally, if f is any dyadic scalar primitive operation, then f/V
denotes the value of the APL expression


      V[1]fV[2]f...fV[N],


where N is the length of V and, for purposes of illustration, we are assuming
the index origin is 1. We call f/V the f-reduction of V. Here are some
examples.


      V                       V
2 3 5 7                 2 3 5 7
      ⌈/V                     ⌊/V
7                       2


The ⌈-reduction ⌈/V gives the maximum component of V while ⌊/V is
the minimum component.

If X is a logical vector, then ∨/X is 1 provided at least one component
of X is 1 and ∧/X is 1 only when all the components of X are 1.




      X←1 1 0 1 0             ∧/X
      ∨/X               0
1


Thus, to test equality of vectors V and W of the same length, we form
∧/V=W.


      V                       V=W
2 3 5 7                 1 1 0 1
      W←2 3 6 7               ∧/V=W
                        0


If A and B are arrays of the same shape, we check equality of A and B with
the statement ∧/,A=B.


      ⎕←A←B←2 3⍴⍳6            ∧/,A=B
1 2 3                   1
4 5 6

The operation of f-reduction satisfies the condition that f/V,W is the
same as (f/V)f f/W for any nonempty vectors V and W.


      V                       +/V,W
2 3 5 7                 35
      W                       (+/V)++/W
2 3 6 7                 35


In order for this condition to hold when W is ⍳0, it is necessary to define
f/⍳0 to be a right identity element for f, provided such an identity
element exists.


      +/⍳0                    ⌈/⍳0
0                       ¯7.237005577E75
      ×/⍳0                    ⌊/⍳0
1                       7.237005577E75
      -/⍳0
0


The identity element for ⌈ should probably be denoted −∞, but ⌈/⍳0 is
defined to be the smallest number representable on the particular terminal
system being used. Similarly, ⌊/⍳0 is the largest number on the system.
The f-reduction of a vector of length 1 is the single component of that
vector.




+/,6 x/ 46 
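
The right-to-left grouping in the definition of f/V, together with the treatment of empty vectors and vectors of length 1, can be sketched in Python. This is an illustration only and not APL; the name apl_reduce and its identity parameter are hypothetical, and the caller is assumed to supply the identity element that an APL system would know for each primitive.

```python
from functools import reduce
import operator

def apl_reduce(f, v, identity=None):
    """Compute V[1] f (V[2] f (... f V[N])), grouping from the right."""
    v = list(v)
    if not v:
        # An empty vector reduces to the identity element, if one was given.
        if identity is None:
            raise ValueError("empty vector needs an identity element")
        return identity
    # Start from the last component and fold the others in from the right.
    return reduce(lambda acc, x: f(x, acc), reversed(v[:-1]), v[-1])

v = [2, 3, 5, 7]
print(apl_reduce(operator.add, v))     # sum of the components
print(apl_reduce(operator.mul, v))     # product of the components
print(apl_reduce(max, v))              # maximum component
print(apl_reduce(operator.sub, v))     # 2-(3-(5-7)), an alternating sum
print(apl_reduce(operator.add, [], 0)) # identity element for +
```

The subtraction case shows why the grouping matters: for an operation that is not associative, folding from the left would give a different value.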


Reduction can be applied to arrays of any rank. The f-reduction of a
scalar S is just S. If A is a matrix, then f/A is the vector whose Ith entry
is f/A[I;], the f-reduction of the Ith row of A. Reduction along the
columns is denoted f⌿A.


A +7A 


The indexed form f/[I] may also be used.


      +/[2]A                  +/[1]A
6 15                    5 7 9


Reductions of arrays of rank 3 and larger are similarly defined. 

The f-scan operation, which is denoted f\, is closely related to
f-reduction. Although the result of the f-scan is independent of the index
origin, it is easier to describe with origin 1 indexing. If V is a vector, then
W←f\V is the vector with the same length as V such that W[I] is f/I↑V.


      V                       ×\V
2 3 5 7                 2 6 30 210
      +\V                     ∧\1 1 1 0 1 0
2 5 10 17               1 1 1 0 0 0


The components of +\V are often called the partial sums of V and the
components of ×\V are called the partial products.
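
The definition of the scan as a sequence of reductions of prefixes can be written out directly. The Python sketch below is not APL, and the names fold_right and apl_scan are hypothetical; it follows the definition literally rather than efficiently, so that the Ith result is the right-to-left reduction of the first I components.

```python
import operator

def fold_right(f, v):
    """Right-to-left reduction of a nonempty list."""
    result = v[-1]
    for x in reversed(v[:-1]):
        result = f(x, result)
    return result

def apl_scan(f, v):
    """Return the list whose Ith entry is the f-reduction of the first I
    components of v (I counted from 1)."""
    v = list(v)
    return [fold_right(f, v[:i]) for i in range(1, len(v) + 1)]

v = [2, 3, 5, 7]
print(apl_scan(operator.add, v))   # partial sums
print(apl_scan(operator.mul, v))   # partial products
print(apl_scan(operator.and_, [1, 1, 1, 0, 1, 0]))
```

For associative operations such as + and × the same values can be accumulated in a single left-to-right pass; the literal form is kept here because it also covers operations such as subtraction.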

For any array A the f-scan f\A is an array with the same shape as A.
The axis along which the scan is performed is indicated in the usual way.
A scan along the last axis is f\, along the first axis is f⍀, and along the
Ith axis is f\[I].


      A                       +⍀A
1 2 3                   1 2 3
4 5 6                   5 7 9
      +\A                     +\[2]A
1 3  6                  1 3  6
4 9 15                  4 9 15




EXERCISES 


1 Let V be a vector of nonzero numbers. Describe -/V and ÷/V in
traditional notation.


2 For any matrix A the numbers +/+/A and +/,A are equal. For 
what other scalar dyadic operations f are f/f/A and f/,A the same? 


3 How is it possible to test two arrays A and B for equality without 
running the risk of producing an error message when A and B may 
have different ranks or different shapes? 


4 Let A be a matrix. Is it possible to say which is larger, ⌈/⌊/A or
⌊/⌈/A?

5 Let A be a matrix. Show that +\+⍀A and +⍀+\A are always the
same and describe this matrix.

6 Write an APL expression for the sum of the squares of the integers
from 1 to N. Compare this expression to the procedure SUMSQX
in Section 4.


7. INNER AND OUTER PRODUCTS 


Suppose x = (x_1, x_2, x_3) and y = (y_1, y_2, y_3) are three-dimensional real
vectors. Then the inner product of x and y is defined to be
x_1 y_1 + x_2 y_2 + x_3 y_3. Using APL notation, we can write the inner
product of X and Y as +/X×Y. We can also write it as X+.×Y.


X<«2 1.3 _ X+.xY 
Y¥<1 4 2 8 
+/XxY 


The operation +.× is one example of an APL inner product. The inner
products form the largest single class of APL primitive operations. Suppose
f and g are two dyadic scalar primitive operations and X and Y are vectors
of the same length. The inner product Xf.gY is defined to be f/XgY.


      X×.+Y                   X+.⌊Y


      ×/X+Y                   +/X⌊Y


If one of the operands X and Y has length 1, it is expanded to match the
length of the other operand.




      Z←,2                    X+.×2 2 2
      X+.×Z             8


There are hundreds of possible inner products, and a surprisingly large num- 
ber of them have useful applications. The reader should experiment with a 
variety of inner products to get a feel for their utility. 

It is possible to form the inner product of arrays of any rank. If one
of the operands is a scalar, then it is converted to a vector of length 1 before
proceeding. If A and B are arrays of positive rank, then Af.gB is defined
only if A and B are conformable in the sense that the last dimension ¯1↑⍴A
of A is equal to the first dimension 1↑⍴B of B or one of these dimensions
is equal to 1. If A and B are conformable, then Af.gB has rank two less
than the sum of the ranks of A and B and has shape (¯1↓⍴A),1↓⍴B. The
entries of Af.gB are calculated by forming all possible inner products of
vectors along the last axis of A with vectors along the first axis of B. For
example, if A and B are matrices, then C←Af.gB defines C to be the matrix
such that C[I;J] is A[I;]f.gB[;J], the inner product of the Ith row
of A and the Jth column of B.


      ⎕←A←2 2⍴1 2 3 0
1 2
3 0
      ⎕←B←2 2⍴2 1 ¯1 3
 2 1
¯1 3
      ⎕←C←A+.×B
0 7
6 3
      A[1;]+.×B[;1]
0
      C[1;1]
0
      A[2;]+.×B[;1]
6
      C[2;1]
6


Here we see that the +.× inner product for matrices is just the usual matrix
product.
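
The rule that C[I;J] is the f-reduction of the row A[I;] combined componentwise with the column B[;J] under g can be sketched for matrices in Python. The sketch is not APL and handles only the rank-2 case; the names fold_right and inner are hypothetical, and matrices are assumed to be given as lists of rows of equal length.

```python
import operator

def fold_right(f, v):
    """Right-to-left reduction of a nonempty list, as in f/V."""
    result = v[-1]
    for x in reversed(v[:-1]):
        result = f(x, result)
    return result

def inner(f, g, a, b):
    """Inner product A f.g B of two matrices given as lists of rows."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("last dimension of A must equal first dimension of B")
    columns = list(zip(*b))
    return [[fold_right(f, [g(x, y) for x, y in zip(row, col)])
             for col in columns]
            for row in a]

a = [[1, 2], [3, 0]]
b = [[2, 1], [-1, 3]]
print(inner(operator.add, operator.mul, a, b))   # ordinary matrix product

r = [[1, 0], [0, 1], [1, 1]]
s = [[1, 1], [0, 1]]
print(inner(operator.or_, operator.and_, r, s))  # logical "or of ands"
```

Passing other pairs of functions, for example min as f and operator.add as g, gives further inner products without changing the code.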

Another important inner product is ∨.∧, which is defined only for
logical arrays.




      ⎕←R←3 2⍴1 0 0 1 1 1
1 0
0 1
1 1
      ⎕←S←2 2⍴1 1 0 1
1 1
0 1
      ⎕←T←R∨.∧S
1 1
0 1
1 1
      R[2;]∨.∧S[;1]
0
      T[2;1]
0


If A is a matrix and V is a vector, then Af.gV is the vector whose Ith
component is A[I;]f.gV. Similarly, Vf.gA is the vector whose Ith
component is Vf.gA[;I].


      A                       A+.×V
1 2                     0 6
3 0                           V+.×A
      V←2 ¯1            ¯1 4


As a final example, let us consider the case in which A is a matrix
and B has rank 3. If C is Af.gB, then C has rank 3 and C[I;J;K]
is A[I;]f.gB[;J;K].

Besides the inner products, APL also contains operations called outer
products, one for each dyadic scalar primitive operation f. If V and W are
vectors, then the outer product V∘.fW is the matrix A such that A[I;J]
is V[I]fW[J].


      V←2 ¯7                  V∘.|W
      W←3 4              1  0
      V∘.+W             ¯4 ¯3
 5  6                         (⍳3)∘.=⍳3
¯4 ¯3                   1 0 0
      V∘.×W             0 1 0
  6   8                 0 0 1
¯21 ¯28


In general, the outer product A∘.fB is formed by applying f to all possible
pairs of an entry in A and an entry in B. The rank of A∘.fB is the sum
of the ranks of A and B and the shape of A∘.fB is (⍴A),⍴B.
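
For two vectors the outer product is simply a table of all pairwise results, which the following Python sketch makes explicit. It is an illustration rather than APL, restricted to vector arguments, and the names outer and residue are hypothetical. The residue function is written with its arguments in the APL order, so that residue(x, y) corresponds to X|Y.

```python
import operator

def outer(f, v, w):
    """Outer product of two vectors: entry [i][j] is f(v[i], w[j])."""
    return [[f(x, y) for y in w] for x in v]

def residue(x, y):
    """APL residue X|Y; Python's % gives a result with the sign of x."""
    return y if x == 0 else y % x

v = [2, -7]
w = [3, 4]
print(outer(operator.add, v, w))
print(outer(operator.mul, v, w))
print(outer(residue, v, w))
print(outer(lambda x, y: int(x == y), [1, 2, 3], [1, 2, 3]))  # identity matrix
```

The last line corresponds to comparing ⍳3 with itself under = and produces the 3-by-3 identity matrix.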


EXERCISES 


1 Assume V+4 3 1 and W+2 1 1. Evaluate the inner product 
of V and W for each of the following inner product operations by 
hand. Check yourself by entering the appropriate expression at a 


terminal. 

(a) +.% (d) 4. = (g) *. 
(b) -.+ (e) V.< (h) +.L 
(c) [ .+ (f) +.7 (i) LT 


2 Let V and W be as in Exercise 1. Evaluate the outer product of 
V and W for each of the following outer product operations by 
hand. Check your answers at a terminal. 


(a) o.,+ (c) °.= (e) o.[ 
(b) °.x (d) °.| (f) °.# 

3 Show that whenever A, B, and C are conformable arrays with
2≤⍴⍴B, then A+.×B+.×C is the same as (A+.×B)+.×C. Give an
example showing that this is not always true when B is a vector.


*4 Determine which inner products satisfy the associativity property 
described in Exercise 3. 


8. SOME ADDITIONAL OPERATIONS 


In this section we will investigate six more APL operations—the decode, 
encode, matrix inverse, matrix division, execute, and format operations— 
whose definitions are somewhat more involved than the definitions of the 
operations we have discussed so far. 

The decode operation ⊥ and the encode operation ⊤ are useful for
working with radix notation using bases other than 10, for "packing" many
small integers into a single large integer to save space, and for certain
computations with polynomials. If B and U are vectors of the same length,
then B⊥U is defined to be W+.×U, where W is the weighting vector of length
⍴B defined as follows: W[⍴W] is 1 and W[I-1] is B[I]×W[I]. For
example, if B is 10 5 3 2, then W is 30 6 2 1.




      B←10 5 3 2              W+.×U
      W←30 6 2 1        153
      U←4 3 7 1               B⊥V
      V←2 ¯1 3 1.5      61.5
      B⊥U                     W+.×V
153                     61.5


If B is a scalar, then B is extended to form a vector of length ⍴U.


      10⊥U                    11⊥U
4371                    5765


Here we see that B⊥U gives the decimal representation for the integer whose
digits in radix notation using the base B are given in the vector U.

The decode operation can be used to evaluate polynomials. For example,
to evaluate $3X^3 - X^2 + 2X + 5$ at X = 7, we form


      7⊥3 ¯1 2 5
999


Note that the coefficients must be listed in the order of decreasing powers 
of X. 
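
The weighting-vector definition of decode, including the extension of a scalar base, can be sketched in Python. This is an illustration and not APL; the name decode is hypothetical, and only the case of a vector (or scalar) base with a vector of digits is covered.

```python
def decode(b, u):
    """Evaluate digits u in the mixed radix b using the weighting vector w,
    where the last weight is 1 and w[i-1] = b[i] * w[i]."""
    u = list(u)
    if isinstance(b, (int, float)):
        b = [b] * len(u)          # a scalar base is extended to match u
    b = list(b)
    if len(b) != len(u):
        raise ValueError("base and digit vectors must have the same length")
    w = [1] * len(b)
    for i in range(len(b) - 1, 0, -1):
        w[i - 1] = b[i] * w[i]
    return sum(wi * ui for wi, ui in zip(w, u))

print(decode([10, 5, 3, 2], [4, 3, 7, 1]))  # weights are 30 6 2 1
print(decode(10, [4, 3, 7, 1]))             # ordinary decimal digits
print(decode(7, [3, -1, 2, 5]))             # polynomial evaluated at 7
```

Note that the first component of the base never enters the weights, which is why a scalar base can be used for polynomial evaluation with coefficients listed in decreasing powers.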

Decode is extended to arrays of any rank in a manner similar to inner
products. If either B or U is a scalar, then it is converted into a vector of
length 1 before beginning to evaluate B⊥U. If B and U are conformable for
inner products, then B⊥U is obtained by forming all possible numbers
C⊥V, where C is a vector along the last axis of B and V is a vector along the
first axis of U.


O<R<+2 304 5326 1 
4 5 3 
2 6 1 
O<+S+3 296 13 4 2 2 
6 1 
mm 
2 2 
[(1<+T<+RiLS 
97 4 
Ud 4 
RL1;)]1S5032] 


ev) 


TL13;2] 




The encode operation can be used to compute the vector of digits in 
the representation of an integer in any specified base. 


      10 10 10 10⊤2573
2 5 7 3

      (5⍴8)⊤2573
0 5 0 1 5


The entries in the arguments of ⊤ do not have to be positive integers.


      2.5 1.7 ¯4.8 3.1⊤7.3
1.5 0.7 ¯2.8 1.1


If B is a vector and X is a scalar, then U←B⊤X is defined as follows.
1. If B is empty, then U←⍳0.
2. If 0 = ¯1↑B, then U←(-⍴B)↑X.
3. If 0 ≠ D←¯1↑B, then U is defined recursively to be
      ((¯1↓B)⊤Q),Y,
   where Y←D|X and Q←(X-Y)÷D.
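
The three cases of this definition translate almost line for line into a recursive function. The Python sketch below is an illustration, not APL; the name encode is hypothetical. It relies on the fact that Python's divmod returns a floor quotient and a remainder with the sign of the divisor, which is assumed here to match the residue D|X used in the definition.

```python
def encode(b, x):
    """Represent the scalar x in the mixed radix given by the list b."""
    b = list(b)
    if not b:                       # case 1: empty base, empty result
        return []
    d = b[-1]
    if d == 0:                      # case 2: x itself, padded with zeros
        return [0] * (len(b) - 1) + [x]
    q, y = divmod(x, d)             # case 3: y is d|x and q is (x-y)/d
    return encode(b[:-1], q) + [y]

print(encode([10, 10, 10, 10], 2573))
print(encode([8] * 5, 2573))
print(encode([5, 6, 7, 0], 13))
print(encode([2.5, 1.7, -4.8, 3.1], 7.3))  # non-integer radices also work
```

With floating-point arguments the components are subject to rounding, so they should be compared with a tolerance rather than for exact equality.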


Here are some additional examples. 


      ⍴(⍳0)⊤7
0
      5 6 7 0⊤13
0 0 0 13
      B←2.5 1.7 ¯4.8 3.1
      X←7.3
      ⎕←Y←3.1|X
1.1
      ⎕←Q←(X-Y)÷3.1
2
      ((¯1↓B)⊤Q),Y
1.5 0.7 ¯2.8 1.1
      B⊤X
1.5 0.7 ¯2.8 1.1


Encode is extended to arrays of any rank in a manner similar to outer
products. If B and X are any arrays, then B⊤X is obtained by computing all
possible vectors C⊤Y, where C is a vector along the first axis of B and
Y is an entry in X.




10 10 107932 17 485 


X<126 37 98 
oU<BTX 


UL3;23;3] 


BL3;2]TXL3] 


The encode operation can be used to construct matrices of charac- 
teristic vectors. 


      ⎕IO←0
      ⎕←V←2 2 2⊤⍳8
0 0 0 0 1 1 1 1
0 0 1 1 0 0 1 1
0 1 0 1 0 1 0 1
      ⎕IO←1


The columns of V are the characteristic vectors for the eight subsets of
⍳3.

Decode and encode can also be used to pack a vector of small integers 
into a single large integer in order to save space and speed up certain types 
of comparisons. If 


      ⎕←A←3 4⍴6 1 4 2 3 5 1 0 3 2 6 4
6 1 4 2
3 5 1 0
3 2 6 4


then all of the entries in A are nonnegative integers less than 7. We can
pack A into the vector


      ⎕←W←7⊥A
318 86 209 102


Each component of W describes a column of A. We can recover A from
W, if needed, as follows.


      7 7 7⊤W
6 1 4 2
3 5 1 0
3 2 6 4


The next two operations are represented by the symbol ⌹, which is
called domino and is formed by overstriking ⎕ with ÷. The monadic
operation is called matrix inverse, and the dyadic operation is called matrix
division.

If A is a nonsingular square matrix, then ⌹A is the inverse of A.


      ⎕←A←2 2⍴3 5 2 3
3 5
2 3
      ⎕←C←⌹A
¯3  5
 2 ¯3
      A+.×C
1.000000000E0    ¯7.993605777E¯15
4.440892099E¯16   1.000000000E0
      C+.×A
1.000000000E0    6.883382753E¯15
0.000000000E0    1.000000000E0


The result of ⌹A on an APL system is actually only an approximation
to the inverse of A. In the preceding example, the matrix printed for C is
the exact inverse of A, but the matrix A+.×C is not quite the identity
matrix. This happens because some rounding of the entries in C takes place
when C is displayed. The value of C in the system is not exactly the
inverse of A.

If B is a matrix or a vector, then X←B⌹A is (⌹A)+.×B, which is the
unique solution of the linear equation ∧/,B=A+.×X. This equation would
be written AX = B in conventional notation.

The expressions ⌹A and B⌹A are defined more generally when A is a
matrix whose columns are linearly independent, which implies that A has at
least as many rows as columns. Suppose the columns of A are linearly
independent and B is a vector of length 1↑⍴A. There need not be any vector X
such that A+.×X is equal to B, but there is a unique X that minimizes the
expression +/(B-A+.×X)*2, which is the sum of the squares of the
components of the difference B-A+.×X. This vector X is the result of B⌹A and
is called the least-squares solution of the equation ∧/B=A+.×X. The vector
A+.×X is in the vector space W spanned by the columns of A and is, in fact,
the vector in W closest to B. We call A+.×X the orthogonal projection of
B onto W. The vector B-A+.×X is orthogonal to W. If D is a matrix, then
D⌹A is the matrix whose Jth column is D[;J]⌹A. The value of ⌹A is
M⌹A, where M is the identity matrix of size 1↑⍴A.


O<A+3 29112 112 


1 1 
2 1 
1 2 
O<+D<+3 2908 5221 4 
8 5 
2 2 
1 4 
DEA 
2 1 
1 1 
DL;2]HA 
1 1 


The symbol ⍎, made by overstriking ⊥ with ∘, represents the execute
operation, which allows an APL procedure to construct an APL statement as
a vector of characters and then execute that statement. If S is a vector of
characters that is a well-formed APL statement, then the expression ⍎S
causes S to be executed.


      ⍎'2+5'                  ⍎'J←I+3'
7                             J
      ⎕←I←⍎'2×5'        13
10


The last operation in this section is represented by the symbol ⍕, which
is made by overstriking ⊤ with ∘. The operation ⍕ is called format, and it
gives the user more control over the way output is printed at the terminal
than is possible using the system variable ⎕PP, which is discussed in Section
A2.2. If A is an array, then ⍕A is a character array corresponding to the
way A would be displayed at the terminal.




O+A<2 2017 3 4 19 R 

17 3 17-3 

4 19 4 19 
R<FA R137] 
oR 7 


Format may be used to control the number of decimal places printed
for entries in an array. For example, let


      ⎕←T←(2 3⍴⍳6)*0.5
1 1.414213562 1.732050808
2 2.236067977 2.449489743


To print out T with its entries rounded to two decimal places, we enter
2⍕T.


      2⍕T
 1.00 1.41 1.73
 2.00 2.24 2.45


We can spread out the entries by asking that six spaces be used for each 
entry. 


      6 2⍕T
  1.00  1.41  1.73
  2.00  2.24  2.45


Different spacing and different precision can be used in each column. 


      4 2 6 3 4 1⍕T
1.00 1.414 1.7
2.00 2.236 2.4


Here four spaces and two places after the decimal point are specified for
the first column, six spaces and three decimal places for the second
column, and four spaces and one decimal place for the third column.
Additional information about ⍕ can be found in the APL reference books.


EXERCISES 


1 Use ⊥ to determine the decimal representations for the integers
whose base 8 representations are 123, 4113, and 327614.


2 Use ⊤ to determine the base 7 representations of the integers
5319, 97219, and 2763140.


3 Evaluate $3X^3 - 2X^2 + 4X + 1$ for X = 3, −2, and 1.5 using ⊥.


4 Construct a matrix whose rows are the characteristic vectors for the
16 subsets of ⍳4.


5 Compute the inverse of the matrix


2 —-!l1 4 
0 6 —2 
5 3 


using ⌹.

6 What would be the result of executing the statement

      ⍎'3×',5⍴'2+'

on an APL system? Check your answer at a terminal.


7 Let B be a vector, let X be a scalar, and let U←B⊤X. Show that
B|U is U. Suppose I is an integer from 1 to ⍴B, and C←I↓B and
V←I↓U. Show that C⊥V is (×/C)|X.


8 Let A be a matrix whose columns are linearly independent and let
B and C be vectors of length 1↑⍴A. Show that (B+C)⌹A is equal
to (B⌹A)+C⌹A.


Appendix 2.
APL SYSTEMS 


This appendix provides information about the way the APL language de- 
scribed in Appendix 1 has been implemented in computer systems. The main 
topics covered are editing of APL statements, modification of procedures, 
system variables, system commands, workspaces, debugging techniques, and 
programming efficiency. APL systems differ in many ways from one another. 
Your system may not behave exactly as described here. Contact your local 
installation for help. Additional material can be found in APL reference 
manuals. 


1. EDITING 


One might get the impression from reading the APL dialogues in this book 
that no one ever makes a mistake typing an APL statement and that pro- 
cedures always work correctly when they are entered and therefore never 
need to be changed. This is not the case. The purpose of this section is to 
show how to correct mistakes on the line currently being typed and how 
to modify procedures when they do not work as intended. First, however, 
it is necessary to discuss the types of terminals commonly used with APL 
systems. 

There are two basic standards governing the way terminals communi- 
cate with computers. These standards are referred to by their acronyms: 
EBCDIC (Extended Binary Coded Decimal Interchange Code) and ASCII 
(American Standard Code for Information Interchange). Terminals manu- 
factured by IBM are generally EBCDIC terminals, while most other terminals 
use the ASCII standard. It is not hard to distinguish between EBCDIC and 
ASCII terminals. On an ASCII terminal keyboard there is a key marked
"CONTROL" (often abbreviated "CNTRL"). On an EBCDIC terminal there
is no such key. On an ASCII terminal the key that interrupts a computation
in progress is marked "BREAK" or "BRK", while on an EBCDIC terminal
this key is marked "ATTN".

If an error is discovered on a line before the RETURN key is hit, it is
possible to backspace to the error and then retype the line correctly from
that point. On an EBCDIC terminal one backspaces to the error and hits the
ATTN key. The system causes the terminal to space downward one line and
print the symbol ∨. The terminal spaces down another line, and the
corrected version of the line may be entered. The following example illustrates
the correction of an error while typing the statement 2×3×5×7 on an
EBCDIC terminal.


      2×3+5+
         ∨
         ×5×7
210


On an ASCII terminal one backspaces to the error and hits the LINEFEED
or LF key. The terminal spaces down one line, and the correction is
entered. The previous example would look like this on an ASCII terminal.


      2×3+5+
         ×5×7
210


The characters of an APL statement may be typed in any order. For
example, it is possible to enter the statement 2×3×5×7 by typing 2 3 5 7
and backspacing to fill in the missing times signs.

The remainder of this section deals with methods of displaying and
changing procedure definitions. Let us assume that we have entered the
definition of a procedure PROC. To display this definition, we type ∇ to
enter the definition mode, the name of the procedure, the symbols [⎕],
which indicate that the entire procedure is to be displayed, and another
∇ to leave the definition mode.


      ∇PROC[⎕]∇
    ∇ X PROC Y
[1]   SIM←X+Y
[2]   DIFF←X-Y
[3]   QUOT←X÷Y
[4]   MOD←X|Y
    ∇


Now suppose that we decide to change PROC by entering a new line at the
end. We enter ∇PROC and the system prompts for line [5], which we enter.


      ∇PROC
[5]   EXP←X*Y
[6]


The system prompts for the next line but, at this point, we realize that we
forgot a line between lines [2] and [3]. We ask to be prompted for line
[2.5] and, when we are, we type the new line.


      [2.5]
[2.5] PROD←X×Y
[2.6]


The system prompts for line [2.6], but we do not wish to insert any more
lines. Instead, we decide to delete line [4], which is not really needed.
We enter [∆4].


      [∆4]
[4]


The system prompts for the line we deleted. (Some systems prompt for the
next line.) Now we see that the variable SIM in line [1] should really be
SUM. We decide the quickest way to make the change is to retype the line.
We ask to be prompted for line [1] and reenter the line.


      [1]
[1]   SUM←X+Y
[1.1]


At this point we would like to look at the current definition of PROC. 


      [⎕]
    ∇ X PROC Y
[1]   SUM←X+Y
[2]   DIFF←X-Y
[2.5] PROD←X×Y
[3]   QUOT←X÷Y
[5]   EXP←X*Y


We are satisfied with PROC and leave the definition mode. The system now
renumbers the lines of the procedure with consecutive positive integers.
This can be seen by displaying the procedure again.


      ∇PROC[⎕]∇
    ∇ X PROC Y
[1]   SUM←X+Y
[2]   DIFF←X-Y
[3]   PROD←X×Y
[4]   QUOT←X÷Y
[5]   EXP←X*Y
    ∇


It is possible to request that the definition of a procedure be displayed
from line [N] on. This is done by entering [⎕N]. To have just line [N]
printed, enter [N⎕]. The system will type the line and then prompt for that
same line. For the purposes of editing only, the header of the procedure is
considered to be line [0].

It would be inefficient to modify a long line in a procedure by
retyping it entirely. There is another method which can be used instead. The
following example uses this method to change the header of our procedure
PROC to read SUM←PROC X.


      ∇PROC[0⎕8]
[0]   X PROC Y
      / 4    /
[0]   SUM←PROC X∇


The expression [0⎕8] indicates that we want to begin modifying line
[0] at approximately position 8, counted from the left margin. Line [0]
is printed, the terminal spaces down one line and the carriage or cursor stops
under position 8. A / typed under a character indicates that character is to
be deleted. A number from 1 to 9 typed under a character indicates that the
specified number of blanks is to be inserted before the character. (More than
9 blanks can be requested by typing a letter. The Ith letter of the alphabet
requests 5×I blanks.) After a carriage return, the line is retyped with the
deletions and inserted blanks. The cursor is positioned at the first inserted
blank. The line may now be edited in the usual manner. In particular, the
blanks may be filled in.

Additional characters may be added at the end of line [N] by entering
[N⎕0]. Line [N] is typed, and the cursor is positioned at the end of the line.


2. SYSTEM VARIABLES 


In an APL system there is a class of special variables, called system variables,
that describe the environment in which the APL programs run. For us the
most important system variable is the index origin ⎕IO. However, there are
several other system variables that may occasionally be of use. These are
⎕CT, ⎕PP, ⎕PW, ⎕AI, and ⎕WA, the comparison tolerance, the printing
precision, the printing width, the accounting information, and the work
area available, respectively.

When computations are performed in an APL system, the results are
often only approximations to the "true" values, that is, the values obtained
using exact calculations in the field of real numbers. For example, if we let
X←2*0.5, we should not be surprised if X*2 is not exactly 2. For this
reason, when values are compared in an APL system, they are not checked
for exact equality but, instead, for equality within the tolerance specified
by the system variable ⎕CT. That is, A and B are considered equal if the
absolute value of their difference does not exceed ⎕CT multiplied by the
larger of the absolute values of A and B. Thus A=B is equivalent to


      (|A-B)≤⎕CT×(|A)⌈|B


The value of ⎕CT may be any number from 0 to 1. The normal value is
1E¯13, that is, $10^{-13}$. Setting ⎕CT to 0 produces exact comparisons.
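
The comparison rule can be tried out in any language with floating-point arithmetic. The Python sketch below is an illustration only; the name tolerant_equal and its ct parameter are hypothetical stand-ins for the APL relation = and the system variable, and the numbers used are chosen to show rounding on a machine with binary floating point rather than to reproduce the behavior of a particular APL system.

```python
def tolerant_equal(a, b, ct=1e-13):
    """True when |a-b| does not exceed ct times the larger of |a| and |b|."""
    return abs(a - b) <= ct * max(abs(a), abs(b))

x = 0.1 + 0.2                          # not exactly 0.3 in binary floating point
print(x == 0.3)                        # exact comparison
print(tolerant_equal(x, 0.3))          # comparison with the normal tolerance
print(tolerant_equal(x, 0.3, ct=0))    # tolerance 0 gives exact comparison
```

Because the tolerance is scaled by the larger operand, the test is relative: two very large numbers may differ by much more than 1E¯13 and still compare equal, while a nonzero number never compares equal to zero.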


      ⎕←X←2÷3                 2=Y
0.6666666667            0
      Y←3×X                   ⎕CT←1E¯13
      Y-2                     2=Y
¯2.220446049E¯16        1
      ⎕CT←0


Besides the relational operations <, ≤, =, ≥, >, and ≠, the operations
⌈, ⌊, ∊, and ⍳ are affected by the value of ⎕CT.


      ⎕CT←0                   ⎕CT←1E¯13
      ⌊Y                      ⌊Y
1                       2
      1 2⍳Y                   1 2⍳Y
3                       2


The system variable ⎕PP indicates the number of significant figures
that are to be printed when numeric values are displayed at the terminal.
Typically, ⎕PP can vary from 1 to 16, with 10 as the normal value.


      ⎕PP←16                  ⎕PP←10
      ⎕←X←2*0.5               X
1.414213562373095       1.414213562
      2*35                    2*35
34359738368             3.435973837E10




Terminals differ in the number of characters that can be printed on a
line. The system variable ⎕PW can be assigned a convenient value less than
the width of the terminal line to assist the system in formatting output for
the terminal.


      ⎕PW←40
      ⍳20
1 2 3 4 5 6 7 8 9 10 11 12 13
      14 15 16 17 18 19 20
      ⎕PW←80
      ⍳20
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20


In order to determine the amount of CPU time used in a particular
calculation, we can use the system variable ⎕AI, which is a vector of length
4. The only component of ⎕AI that we need to be concerned with is the
second, which tells the amount of CPU time used since sign-on. The units
used depend on the particular APL system. Older systems may use 1/60ths
of a second, while newer systems usually use milliseconds. To time how long
it takes to square a 10-by-10 integer matrix, we may enter the following.


      A←?10 10⍴100            B←A+.×A
      ⎕AI                     ⎕AI
1001 74 223187 209210   1001 96 246006 230023


On this system CPU time is measured in milliseconds, and computing B took
96 − 74 = 22 milliseconds.

The active workspace, the portion of the computer's memory containing
the variables and procedures currently in use, has only a limited amount
of space. The amount of unused space is given by the system variable ⎕WA.
The units as well as the maximum value of ⎕WA depend on the particular
system you are using.


      ⎕WA                     ⎕WA
124280                  120264
      Z←⍳1000


Here 4016 units (in this case, bytes) of memory are required to store the
array Z. Four bytes are needed to store each entry of Z. The remaining
bytes are used to store other information, such as the name and the shape
of Z. (On some APL systems the vectors ⍳N are stored using an amount of
space that does not depend on N.) It is not possible to assign a value to
⎕WA.




In addition to system variables there are also special functions called 
system functions. These functions, which can be used within defined proce- 
dures, provide alternatives to some of the system commands described in 
Section 3. Two system functions permit a procedure to modify another 
procedure and to define new procedures. Thus APL programs can write 
and run other APL programs. Information about system functions can be 
found in APL reference books. 


3. WORKSPACES AND SYSTEM COMMANDS 


The variables and procedures with which we are currently working are stored 
in a portion of the computer’s memory called the active workspace. These 
variables and procedures are lost if we sign off without first saving the 
active workspace into our library. Each user has a library in which inactive 
workspaces may be saved for future use. The number of workspaces that 
can be stored in the library is set by the system manager. 

The APL system contains certain system commands that facilitate 
workspace management. Some of these commands allow us to save the 
active workspace into the library or bring a workspace, or part of a work- 
space, from the library into the active workspace. Other commands permit 
us to determine the status of our active workspace and modify that status if 
desired. All system commands start with a right parenthesis. 

To find out what workspaces are in our library, we issue the )LIB
command.


)LIB 
CLASSLIB 
EXAMPLES 
DEMO 


Here we see that in addition to the two workspaces CLASSLIB and 
EXAMPLES, which are used throughout this book, there is a workspace 
DEMO. To see what is in DEMO, we must first put a copy of it into the active 
workspace with the )LOAD command.


      )LOAD DEMO
SAVED 10.39.26 08/09/79


The )LOAD command does not affect the contents of the library. It merely
makes a copy of the workspace and places the copy in the active workspace,
destroying anything that may have been in the active workspace.

We can find out the names of the variables in DEMO with the )VARS
command.


      )VARS                   ⍴⍴M
A       M       X       0
      ⍴A                      ⍴X
25 40                   37


There are three variables in DEMO, a matrix A, a scalar M, and a vector X.
To see what procedures are in DEMO, we use the )FNS command.
(Remember that APL procedures are usually called functions.)


      )FNS
PROC1   PROC2
      ∇PROC1[⎕]∇
    ∇ Z←X PROC1 Y;U;V
[1]   U←3 PROC2 X+Y
[2]   V←0.5 PROC2 X-Y
[3]   Z←U PROC2 V
    ∇


In DEMO there are two procedures PROC1 and PROC2, the first of which
we have listed. If we try to execute PROC1, we run into trouble.


      4 PROC1 7
DOMAIN ERROR
PROC2[1] Z←Y*X
            ∧


At this point, computation is halted. We can get a better idea of where we
are with the )SI command.


      )SI
PROC2[1] *
PROC1[2]


The )SI command lists the state indicator, which contains a list of the
procedures currently halted along with the corresponding line numbers,
most recently halted first. In our example we are stopped at line [1] of
PROC2, which was called from line [2] of PROC1. It is now possible to
examine the variables in PROC2 to see what is wrong. This will be
discussed in Section 5. To clear the state indicator we enter →.


      →
      )SI


It is possible to bring part or all of a workspace from the library into
the active workspace without destroying the contents of the active
workspace. This is done with the )COPY command.


)COPY CLASSLIB ZGCD ZLCM EXPAND 
SAVED 16.59.26 06/19/79 


)FNS 

EXPAND PROC1 PROC2 ZGCD ZLCM 
35 ZGCD 56 

7 
35 ZLCM 56 

280 


Here we have copied the procedures ZGCD, ZLCM, and EXPAND from the 
workspace CLASSLIB into the active workspace. Omitting the list of 
objects causes the entire workspace to be copied. 

We can delete variables and procedures from the active workspace
with the )ERASE command.


      )ERASE A ZLCM
      )VARS
M       X       R       S
      )FNS
EXPAND  PROC1   PROC2   ZGCD


The variables R and S were created by ZGCD. 
If we happen to forget the name of the active workspace, we can find 
out what it is with the )WSID command. 


)WSID 
DEMO 


Suppose now we wish to save the active workspace in the library with- 
out affecting the workspace DEMO still in the library. To do this, we must 
save the workspace with a new name. 


)SAVE DEMO1 
10.55.07 08/09/79 


The )SAVE command places a copy of the active workspace into the library 
but does not change the active workspace. To wipe the active workspace 
clean, we issue the )CLEAR command. 


)CLEAR 
CLEAR WS 




Clear workspaces do not have a name associated with them. If at some 
later time we decide we no longer need the workspace DEMO in the library, 
we can drop it. 


)DROP DEMO 
10.55.35 08/09/79 
)LIB 
CLASSLIB
EXAMPLES
DEMO1


In order to facilitate the copying of several related objects from a
workspace in the library to the active workspace, it is possible to collect
together a number of arrays and procedures and give a name to the collection.
The term normally used for such a collection is "group", but these groups
have nothing to do with the groups of Chapter 3. To illustrate the
formation and modification of groups, let us reload the workspace DEMO1.


)LOAD DEMO1

SAVED 10.55.07 08/09/79 
)VARS 

M x R 2 
)FNS 

EXPAND PROC1 PROC2 ZGCD 


To form a group named INTEGER consisting of the variables R and S and 
the procedures ZGCD and EXPAND, we enter 


)GROUP INTEGER R S ZGCD EXPAND


To get a list of the groups currently defined, we use the )GRPS system com- 
mand. 


)GRPS 
INTEGER 


To recall what is in a group, we use the )GRP command. 


)GRP INTEGER 
EXPAND ZGCD R S 


We may add new members to the group as follows. 


)GROUP INTEGER INTEGER M 
)GRP INTEGER 
EXPAND M ZGCD R S 




If we erase a group, we destroy the definition of the group and also erase 
the objects in the group. To disperse a group, that is, remove its definition 
without affecting its members, use the )GROUP command with an empty 
list of objects. 


)GROUP INTEGER 
)GRPS 


4. ERROR MESSAGES 


From time to time the system may be unable to complete a computation
or carry out a system command. When this happens, an error message or
trouble report is printed at the terminal. The following is a brief
description of the more common error messages.


DEFN ERROR          Misuse of ∇ or ⎕.

DOMAIN ERROR        Argument not in the domain of the primitive
                    operation.

ENTRY ERROR         An invalid character has been transmitted,
                    perhaps an improper overstrike.

INDEX ERROR         Index value out of range.

LENGTH ERROR        Shapes of the arguments not conformable.

RANK ERROR          Rank of an argument of a primitive operation
                    is wrong.

RESEND              Either an error in transmission or too many
                    characters on a line.

SI DAMAGE           One of the procedures in the state indicator has
                    been adversely affected by editing or by a
                    )COPY or )ERASE.

SYNTAX ERROR        The statement is not formed correctly; for
                    example, a procedure is used without the
                    proper arguments, or a parenthesis is unmatched.

SYMBOL TABLE FULL   Too many names in use.

SYSTEM ERROR        This is serious. Send as complete a description
                    as possible of what happened to the system
                    manager.

VALUE ERROR         Name does not have a value, procedure did not
                    return a value, or constant has a value too large
                    or too small for internal representation.

WS FULL             The space available in the active workspace is
                    too small to complete the task in progress.
                    Check )SI and clear halted procedures if possible.


5. DEBUGGING 


Frequently, a newly defined procedure does not work as intended. This
section describes some simple techniques for finding out what is wrong.
As a start, let us go back to the problem we had in Section 3 with the
procedures PROC1 and PROC2.


∇PROC1[⎕]∇
∇ Z←X PROC1 Y;U;V
[1] U←3 PROC2 X+Y
[2] V←0.5 PROC2 X-Y
[3] Z←U PROC2 V
∇
4 PROC1 7
DOMAIN ERROR
PROC2[1] Z←Y*X
           ∧


The error message tells us that one of the variables X or Y must be outside 
the domain of the exponential operation. We can examine the values of these 
variables. 


X
0.5
Y
¯3


Now we see what is wrong. The procedure PROC2 is trying to take the
square root of a negative number. Since we have no information about
what PROC1 and PROC2 are supposed to do, we cannot tell if the problem
is in the definition of PROC1, in the definition of PROC2, or in our choice
of the arguments 4 and 7. Before leaving this example, however, it is useful
to remark that we can list and edit PROC2 at this point.


∇PROC2[⎕]∇
∇ Z←X PROC2 Y
[1] Z←Y*X
∇


However, without first clearing the state indicator, we cannot list or edit
PROC1.




∇PROC1[⎕]∇
DEFN ERROR
∇PROC1
 ∧
→
∇PROC1[⎕]∇
∇ Z←X PROC1 Y;U;V
[1] U←3 PROC2 X+Y
[2] V←0.5 PROC2 X-Y
[3] Z←U PROC2 V
∇


Only the most recently halted procedure can be edited. 

Sometimes a procedure does not produce any error messages but seems
to be running a long time. We can temporarily stop execution and check
on how things are going. In Section A1.4 we defined a procedure SUMSQX
for computing sums of squares.


∇SUMSQX[⎕]∇
∇ S←SUMSQX N;I
[1] S←I←0
[2] LOOP:→(N<I←I+1)/0
[3] S←S+I*2
[4] →LOOP
∇


Suppose we decide to compute the sum of the squares of the first 10⁶
positive integers using SUMSQX.


SUMSQX 1E6


We wait a few seconds, but there is no response from the system. Curious,
we hit the ATTN or BRK key to stop execution and look at the current
values of I and S.


SUMSQX[3]
S
455261135
I
1110


Everything seems to be all right, so we resume execution at the line where 
we stopped. 


→3


We wait a while longer and decide to check our progress again. 


SUMSQX[2]
S
3375587894
I
2163




By now we realize that it will take too long to complete the calculation 
and we abandon the effort by clearing the state indicator. 


→
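
The two interrupted states shown above can be cross-checked against the closed form for a sum of squares. The following is a Python analogue of the SUMSQX loop, offered as a sketch only; the function names are hypothetical, and the closed form n(n+1)(2n+1)/6 is a standard identity rather than something taken from the procedure.

```python
def sumsq_loop(n: int) -> int:
    """Add the squares 1, 4, 9, ... one at a time, as the SUMSQX loop does."""
    s = 0
    i = 0
    while True:
        i += 1
        if n < i:          # corresponds to the exit test on line [2]
            return s
        s += i * i         # corresponds to line [3]


def sumsq_closed(n: int) -> int:
    """Closed form for 1^2 + 2^2 + ... + n^2."""
    return n * (n + 1) * (2 * n + 1) // 6


# Halted at line [3] with I = 1110: the square of 1110 has not yet been added.
print(sumsq_closed(1109))   # 455261135
# Halted at line [2] with I = 2163: the square of 2163 has already been added.
print(sumsq_closed(2163))   # 3375587894
```

The closed form gives the answer for any N in a fixed number of arithmetic steps, which is one way to avoid the long-running loop altogether.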


It is possible to keep track of the execution of a procedure without
continually stopping it and restarting it. This is done with the trace facility.
To trace the execution of SUMSQX, we create a special variable T∆SUMSQX
that lists the line numbers at which we want tracing information.


T∆SUMSQX←1 2 3 4


Here we are requesting tracing information at every line. We now begin 
execution of the procedure. 


SUMSQX 3
SUMSQX[1] 0
SUMSQX[2] →3
SUMSQX[3] 1
SUMSQX[4] →2
SUMSQX[2] →3
SUMSQX[3] 5
SUMSQX[4] →2
SUMSQX[2] →3
SUMSQX[3] 14
SUMSQX[4] →2
SUMSQX[2] →0
14


For each line number in T∆SUMSQX the system prints the procedure name,
the line number, and the result of that line. Branches are indicated by a
branch arrow followed by the number of the next line to be executed.
Tracing is terminated by assigning ⍳0 to T∆SUMSQX.


T∆SUMSQX←⍳0
SUMSQX 3 
14 
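
For readers who want to experiment with the idea outside APL, the trace facility can be imitated by passing the set of line numbers to watch. This is only a Python sketch of the behaviour described above; the function sumsqx_traced and its trace_lines parameter are hypothetical and do not correspond to any APL system feature.

```python
def sumsqx_traced(n, trace_lines=()):
    """Sum of squares with an optional trace.

    Returns (result, log). The log holds one entry per executed line
    whose number appears in trace_lines, in the style of the dialogue.
    """
    log = []

    def note(line, text):
        if line in trace_lines:
            log.append(f"SUMSQX[{line}] {text}")

    s = i = 0
    note(1, "0")                 # line [1]: S and I set to 0
    while True:
        i += 1
        if n < i:
            note(2, "→0")        # line [2]: branch out of the procedure
            return s, log
        note(2, "→3")            # line [2]: fall through to line [3]
        s += i * i
        note(3, str(s))          # line [3]: new value of S
        note(4, "→2")            # line [4]: branch back to LOOP


result, log = sumsqx_traced(3, trace_lines=(1, 2, 3, 4))
print("\n".join(log))
print(result)

# An empty collection of line numbers switches tracing off.
print(sumsqx_traced(3)[1])       # []
```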


In addition to the trace facility there is also a stop facility. If we want
SUMSQX to stop every time it is about to execute line 4, we assign 4 to the
special variable S∆SUMSQX.


S∆SUMSQX←4
SUMSQX 3
SUMSQX[4]
S
1
→4
SUMSQX[4]
S
5
→4
SUMSQX[4]
S
14
→4
14
S∆SUMSQX←⍳0


In this example we have examined the value of S each time the procedure 
stopped and then restarted it again. 

There is yet another way to observe the execution of a procedure,
but this requires editing the procedure to temporarily insert the characters
⎕← to indicate that results are to be printed at the terminal.


∇SUMSQX[3⎕]
[3] S←S+I*2
[3] ⎕←S←S+I*2
[4] ∇
SUMSQX 4
1
5
14
30
30


The modified SUMSQX prints out each value of S as it is computed in line
[3]. The final 30 in the display is the result that would be printed without
any tracing.


6. PROGRAMMING EFFICIENCY 


A given computation can usually be formulated in many ways in APL. 
Some formulations may execute faster than others, and some may require 
less space than others. Usually the initial concern is to find something that 
works. However, if the computation is to be repeated often, it becomes 
important to devise a method of performing the computation that is ef- 
ficient in its use of both time and space. In this section we present some 
techniques for efficient APL programming that have been used in the pro- 
cedures in CLASSLIB. 

We start with a brief discussion of the way APL stores entries in arrays.
Let us define five vectors of length 1000 and use the system variable ⎕WA
to find out how much space was used to store each vector.


⎕WA
91032
A←1000⍴1
⎕WA
90888
B←1000⍴2
⎕WA
86872
C←1000⍴2.5
⎕WA
78856
D←1000⍴1E10
⎕WA
70840
E←1000⍴'X'
⎕WA
69824


The number of bytes required to store A, B, C, D, and E is 144, 4016,
8016, 8016, and 1016, respectively. What caused the difference? The answer
is that APL systems have at least four different methods of storing arrays.
Different methods are used for logical arrays, for arrays of small integers,
for arrays of fractions and large integers, and for arrays of characters. Let
us consider another example.


⎕WA
69824
F←1000⍴4÷2
⎕WA
61808


This example seems to contradict what was just said. The entries in F seem
to be the same as the entries in B, but roughly twice as much space was
used for F as was used for B. The problem is that APL systems usually
consider the result of any division to be a fraction, even if it happens to be an
integer. We can force the system to recognize 4÷2 as an integer by the use
of the floor operation.


⎕WA
61808
G←1000⍴⌊4÷2
⎕WA
57792


The foregoing discussion can be summarized as follows. Be aware of 
the different methods that APL systems use to store arrays and of tech- 
niques for helping the system find the best way for your specific needs. 
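
The same effect can be observed in other languages that store typed arrays. The sketch below uses Python's standard array module as an analogue; it is not a model of any APL system's internals, and the width of the "i" type is platform dependent (commonly 4 bytes).

```python
from array import array

N = 1000

# Each array stores its entries at a fixed width chosen by the type code.
small_ints = array("i", [2] * N)       # machine integers
fractions = array("d", [2.5] * N)      # double-precision floating point
chars = array("b", [ord("X")] * N)     # one byte per character code


def payload_bytes(a):
    """Bytes taken by the entries themselves, ignoring any header."""
    return a.itemsize * len(a)


# A logical vector needs only one bit per entry when the bits are packed.
packed_logical_bytes = (N + 7) // 8

print("logical (packed):", packed_logical_bytes)
print("small integers:  ", payload_bytes(small_ints))
print("fractions:       ", payload_bytes(fractions))
print("characters:      ", payload_bytes(chars))

# As with 4÷2 in the text, true division yields a fraction (float) even
# when the quotient is whole; floor division keeps the result an integer.
print(type(4 / 2).__name__, type(4 // 2).__name__)
```

On a typical machine the four sizes printed are 125, 4000, 8000, and 1000 bytes, which has the same pattern as the figures in the dialogue once a small fixed overhead per array is allowed for.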

As an illustration of how a different formulation of a computation
can reduce the time required to perform it, let us look at the problem of
constructing the inverse Q of the permutation P. Probably the most natural
way to obtain Q is the following.


Q←⍋P


However, if P is very long, then there is another method that requires less
computer time on most systems.


Q←(⍴P)⍴2
Q[P]←⍳⍴P


The use of 2 in the first line, instead of 0 or 1, is to get the system to
allocate the right amount of space for Q the first time.
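
The difference between the two formulations is that the first sorts, while the second makes a single pass and writes each index directly into its place. A Python analogue, using index origin 0 and hypothetical function names, might look like this:

```python
def inverse_by_sorting(p):
    """Inverse permutation via a grade (argsort): indices ordered by p's values."""
    return sorted(range(len(p)), key=lambda i: p[i])


def inverse_by_scatter(p):
    """Inverse permutation in one pass: q[p[i]] = i for every i."""
    q = [0] * len(p)          # allocate the result once, then fill it
    for i, image in enumerate(p):
        q[image] = i
    return q


p = [2, 0, 3, 1]
print(inverse_by_sorting(p))   # [1, 3, 0, 2]
print(inverse_by_scatter(p))   # [1, 3, 0, 2]
```

Sorting takes on the order of n log n comparisons, whereas the scatter takes n assignments, which is consistent with the text's remark that the second method is faster for long permutations.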

Some of the APL operations are somewhat slow when used with large
arrays. These include ⍋, ⍒, ∊, and dyadic ⍳. These operations should be
used only when there is no faster alternative available.

Here is another example of how a little ingenuity can help a great
deal. Suppose A is a 4-by-4 matrix and X and Y are vectors of length 1000
with entries in ⍳4. We wish to compute the vector Z such that Z[I] is
A[X[I];Y[I]]. Let us assume ⎕IO is 0. Then the "obvious" way to
get Z is


Z←0 0⍉A[X;Y]


However, A[X;Y] is a matrix with 1 million entries, and trying to compute
it will clearly cause a WS FULL condition on all but the largest APL
systems. We could write a procedure containing a loop that computes the
entries in Z one at a time. This would be efficient as far as space is concerned,
but it would be quite slow.

There is another approach that uses the ravel ,A. To illustrate this
method, let us consider a specific example.


O+A<+4 492 130144 2°30 4123241 
“1 3:0 

4 2 3 
“44 2 

2 4 41 
⎕IO←0
X←?1000⍴4
Y←?1000⍴4


WD OF RN 


We begin by forming the ravel of A.

B←,A


Our next task is to find a quick method of locating an entry in B given the
row and column indices of the corresponding entry in A. The index in B
of A[I;J] is J+4×I.


A[2;3]                    A[1;2]


B[3+4×2]                  B[2+4×1]


From this we see that the vector Z can be constructed very easily.


Z←B[Y+4×X]
A[X[237];Y[237]]
3
Z[237]
3


An equivalent construction, which might be more convenient with arrays
of higher rank, is the following.


⍴W←X,[¯0.5]Y
2 1000
Z←B[4⊥W]
Z[237]
3
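
The ravel technique carries over directly to any language with row-major flat storage. The following Python sketch mirrors both constructions with index origin 0. The matrix entries are illustrative only and are not the ones used in the dialogue above, and the random vectors will differ from the book's.

```python
import random

random.seed(0)   # fixed seed so that repeated runs agree

# Hypothetical 4-by-4 matrix; these entries are an assumption.
A = [[2, 1, 3, 0],
     [1, 4, 2, 3],
     [0, 4, 1, 2],
     [3, 2, 4, 1]]

B = [entry for row in A for entry in row]        # the ravel of A, row by row

X = [random.randrange(4) for _ in range(1000)]   # row indices
Y = [random.randrange(4) for _ in range(1000)]   # column indices

# First construction: the entry A[i][j] sits at position j + 4*i of B.
Z = [B[y + 4 * x] for x, y in zip(X, Y)]

# Second construction: stack X over Y and read each column as a base-4 number.
W = [X, Y]
Z2 = [B[4 * W[0][k] + W[1][k]] for k in range(1000)]

print(Z[237] == A[X[237]][Y[237]])   # True
print(Z == Z2)                       # True
```

Only 1000 entries are ever looked up, so the million-entry intermediate matrix of the "obvious" method never has to exist.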


In order to determine which of several alternative approaches to a given
computation is the fastest, each alternative may be timed using the system
variable ⎕AI as described in Section 2.


Appendix


THE SUPPLEMENTAL WORKSPACES


Two APL workspaces have been created to supplement this book. The
workspace CLASSLIB contains procedures for carrying out various
algebraic computations. All of the procedures referred to in the text are in
CLASSLIB. The workspace EXAMPLES contains arrays that describe
various algebraic objects. These arrays appear in the sample dialogues and in
some of the exercises. All of the dialogues assume that the contents of
both CLASSLIB and EXAMPLES are present in the active workspace. One
way to arrange this is to execute the following system commands.


)LOAD CLASSLIB 
)COPY EXAMPLES 


However, if space is limited, it may be necessary to copy only the pro- 
cedures and arrays that are needed at the moment. 

What follows is a brief introduction to the organization of CLASSLIB 
and EXAMPLES. More complete information, including procedure listings, 
can be found in the CLASSLIB user’s manual, which is published sep- 
arately. 


1. CLASSLIB


The workspace CLASSLIB contains over 200 procedures related to
algebraic computation. These procedures are named using a special naming
convention that will be described shortly. Illustrations in the text of the use
of individual procedures may be located using the index. There are also four
global variables, two of general importance and two used by only one
procedure.

The global variables are:


EPSILON A scalar, normally set to 107!%, which controls the ex- 
tent to which small entries in an array of real or com- 
plex numbers are set to 0. 


NOTEST A logical scalar, normally set to 0. When NOTEST is
set to 1, certain validity checking of arguments is skipped,
thereby speeding up some computations.

BIGPRIMES A vector listing the large primes used by MPZDET.

BIGINV A vector of positive integers used by MPZDET. For
I > 1 the value of BIGINV[I] is the inverse modulo
BIGPRIMES[I] of ×/(I-1)↑BIGPRIMES.
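
The definition of BIGINV can be made concrete with a small table of inverses. The sketch below is a Python analogue with index origin 0 and toy primes; the function name is hypothetical, the actual contents of BIGPRIMES are not given here, and the text does not say how MPZDET uses these values.

```python
def build_inverse_table(primes):
    """For each position i > 0, the inverse modulo primes[i] of the
    product of the primes before it. Position 0 has no defined value
    in the description above, so None is stored there.
    """
    table = [None]
    product = 1
    for i in range(1, len(primes)):
        product *= primes[i - 1]                     # product of primes[0..i-1]
        table.append(pow(product, -1, primes[i]))    # modular inverse (Python 3.8+)
    return table


primes = [3, 5, 7]               # toy values, an assumption for illustration
table = build_inverse_table(primes)
print(table)                     # [None, 2, 1]

# Check the defining property: product * inverse is 1 modulo the prime.
print((3 * table[1]) % 5, (15 * table[2]) % 7)   # 1 1
```

Precomputing such inverses once is a common way to avoid repeating modular inversions when results modulo several primes are later combined.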


With seven exceptions, the names of the procedures in CLASSLIB 
consist of two parts, a prefix indicating the algebraic system in which the 
computation is to be performed and a suffix describing the nature of the 
computation. For example, the prefix RX refers to R[X] and the suffix 
SUM means addition in a ring. Thus RXSUM is used to add arrays of real 
polynomials. 
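
The convention can be pictured as a split of each name into a known prefix and a known suffix. The following Python sketch is an illustration only: the function split_name is hypothetical, and the two sets list just a few of the prefixes and suffixes mentioned in this appendix.

```python
# A few prefixes and suffixes taken from this appendix (not the full lists).
PREFIXES = {"C", "FR", "FRX", "GAUSS", "GP", "GT", "RX", "Z", "ZN", "ZNX", "ZX"}
SUFFIXES = {"SUM", "PROD", "NORM", "DET", "SGP", "CHECK", "INIT"}


def split_name(name):
    """Return (prefix, suffix) if the name follows the convention, else None.

    Every split point is tried, so RXSUM is read as RX + SUM because
    both halves are recognized.
    """
    for cut in range(1, len(name)):
        prefix, suffix = name[:cut], name[cut:]
        if prefix in PREFIXES and suffix in SUFFIXES:
            return prefix, suffix
    return None


print(split_name("RXSUM"))     # ('RX', 'SUM')
print(split_name("GTCHECK"))   # ('GT', 'CHECK')
print(split_name("DAQ"))       # None: one of the seven exceptions
```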


The seven exceptions to the naming convention are DAQ, DARV,
DAZV, DERR, EXPAND, EXPANDV, and TRAV. The first three take a single
argument and construct a character matrix that, when displayed at the
terminal, makes clear the algebraic object represented by the argument. The
procedure TRAV performs the analogue of the monadic transpose on
arrays that are considered to be arrays of vectors, such as arrays of complex
numbers or arrays of polynomials. The remaining procedures, DERR,
EXPAND, and EXPANDV, carry out specialized tasks for other procedures
and are not meant to be called directly.

The following is a list of the prefixes and the algebraic systems to 
which they refer. Some of the systems are defined by global variables that 
must be created before the procedures with those prefixes can be used. 
References are given to sections in the text where details may be found 
concerning the way individual objects in the systems are represented. 


C        The field of complex numbers. (4.3)

FR       The finite ring described by the variables FRPLUS,
         FRTIMES, FRNEG, and FRINV. (4.3)

FRX      The ring of polynomials over the ring for FR. (4.4)

GAUSS    The ring of Gaussian integers. (4.2, 4.9)

GP       The symmetric group on ⍳N, where N is the last
         dimension of the argument(s). (3.7, 3.8)

GT       The finite group described by the variables GTABLE,
         GTINV, and GTIO. (3.2)

MPZ      The ring of integers with multiple precision. (2.5)

Q        The field of rational numbers with exact computation.
         (4.3)

R        The field of real numbers. (4.3)

RA       The real algebra described by the structure constants
         RASC. (5.4)

RX       The ring of real polynomials. (4.4)

S        Sets, mostly sets of nonnegative integers. (Chapter 1)

Z        The ring of integers. (Chapter 2, 4.3)

ZA       The integer algebra described by the structure
         constants ZASC. (5.4)

ZN       The ring of integers modulo N. Some procedures require
         N to be small and/or prime. (4.3)

ZNA      The algebra over the ring for ZN described by the
         structure constants ZNASC. (5.4)

ZNX      The ring of polynomials over the ring for ZN. (4.4)

ZX       The ring of integer polynomials. (4.4)

The following is a list of the suffixes, the operations to which they 
refer, and references to sections containing examples. 


ALLORB      Compute all orbits of a permutation group. (3.8)

CHREM       Solve simultaneous congruences, as in the Chinese
            Remainder Theorem. (2.4)

CHV         Compute characteristic vectors of sets. (1.3)

CON         Compute complex conjugate. (4.3)

CYCIN       Convert a permutation from cycle to vector notation.
            (3.7)

CYCOUT      Convert a permutation from vector to cycle notation.
            (3.7)

DEGREE      Compute the degree of a polynomial. (4.4)

DET         Compute the determinant of a square matrix. (4.6)

DIFF        Compute the difference in a ring. (2.5, 4.3, 4.4)

EQREL       Test whether a square logical matrix defines an
            equivalence relation. (1.2)

EVAL        Evaluate a polynomial at a point in the coefficient
            ring. (4.4)

FACTOR      Factor an element of a UFD. (2.4, 4.10, 7.4)

FEL         Select the first element in a set of nonnegative
            integers. (1.4)

GCD         Compute a greatest common divisor. (2.2, 4.10)

INTERP      Interpolate a polynomial. (4.12)

INV         Compute an inverse in the group of units of a ring. (4.3)

IRRED       Find "small" irreducible elements in certain Euclidean
            domains. See also PRIMES. (4.10)

LCM         Compute a least common multiple. (2.2)

LCON        Compute the characteristic matrix for left congruence
            modulo a subgroup. (3.3)

LSYS        Solve a linear system of equations. (6.4, 6.7)

MAG         Compute the magnitude of a real or complex number.
            (2.5, 4.3)

MATINV      Compute the inverse of a matrix. (4.7, 5.3, 6.2)

MATPROD     Compute the product of two matrices. (4.5)

NEG         Compute the negative in a ring. (4.3)

NORM        Compute the norm of a complex number. (4.3)

ORBIT       Compute a single orbit of a permutation group. (3.8)

POWER       Compute a nonnegative power in a ring. (2.5, 4.3)

PRIMES      Find "small" prime elements in certain Euclidean
            domains. See also IRRED. (2.4)

PROD        Compute a product in a group or a ring. (2.5, 3.2,
            4.3, 4.4)

QUOT        Compute a quotient in a Euclidean domain or other
            ring. (2.1, 4.3, 4.9)

RCON        Compute the characteristic matrix for right congruence
            modulo a subgroup. (3.3)

REDUCE      Reduce a matrix over a Euclidean domain using row and
            column operations. (6.5)

REM         Compute a remainder in a Euclidean domain or in
            certain other rings. (2.1, 2.5, 4.9)

ROWREDUCE   Row reduce a matrix over a Euclidean domain. (6.2)

SGN         Compute the signum of a real number or the sign of a
            permutation. (2.5, 3.8)

SGP         Determine all elements in a subgroup from a set of
            generators. (3.3, 3.8)

SORT        Sort the entries of a vector into increasing order and
            remove duplicates. (1.1)

SUB         Construct all subsets of a given size in ⍳N. (1.1)

SUM         Compute a sum in a ring. (2.5, 4.3, 4.4)

SYMG        Construct all elements of a symmetric group. (3.7)


In addition to the suffixes just listed, there are four suffixes that
describe operations of a somewhat different type. These are the suffixes TEST,
NRMLZ, INIT, and CHECK. Procedures ending in TEST are used to decide
whether a given APL array is a valid representation for an array with entries
in a particular algebraic system. A value of 0 is returned if the APL array
does represent an array over the system, and a value of 1 is returned when
the APL array is not a valid representation. For some algebraic systems,
such as the integers modulo N, a given array may have many representations
by APL arrays. Procedures ending in NRMLZ test for validity and, when the
argument is valid, return the standard representation. For example, QNRMLZ
reduces fractions to lowest terms.
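
As an illustration of what normalization means, here is a Python sketch that reduces a fraction to lowest terms and reduces an integer modulo N. It is not the CLASSLIB code: the function names are hypothetical, and representing a rational number as a (numerator, denominator) pair with a positive denominator is an assumption made for this example.

```python
from math import gcd


def q_normalize(num: int, den: int):
    """Return the fraction num/den in lowest terms with a positive denominator."""
    if den == 0:
        raise ValueError("denominator must be nonzero")   # invalid representation
    g = gcd(num, den)          # always positive here, since den is nonzero
    if den < 0:
        g = -g                 # dividing by a negative g makes the denominator positive
    return num // g, den // g


def zn_normalize(x: int, n: int) -> int:
    """Return the standard representative of x modulo n, in the range 0..n-1."""
    return x % n


print(q_normalize(10, 4))     # (5, 2)
print(q_normalize(6, -4))     # (-3, 2)
print(zn_normalize(-1, 7))    # 6
```

Many different pairs represent the same rational number, and many integers represent the same residue; normalization picks one standard representative so that equal elements have equal representations.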

Several of the prefixes refer to algebraic systems defined by global
variables. Procedures with the suffix INIT may be used to initialize these
global variables. Procedures with the suffix CHECK can be used to decide if
an array or pair of arrays really defines a system of the appropriate type. For
example, GTCHECK checks whether an array is a group table.
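
One way such a check could be carried out is sketched below in Python. This is not the GTCHECK listing; the function name is hypothetical, and the elements are assumed to be numbered from 0 with the table entry in row a and column b giving the product of a and b.

```python
def is_group_table(t):
    """Decide whether the square array t is the multiplication table of a group."""
    n = len(t)
    if n == 0 or any(len(row) != n for row in t):
        return False                                   # must be square and nonempty
    elems = range(n)
    if any(entry not in elems for row in t for entry in row):
        return False                                   # closure
    # A two-sided identity element must exist.
    if not any(all(t[e][x] == x and t[x][e] == x for x in elems) for e in elems):
        return False
    # Every row and column must be a permutation of the elements,
    # which gives inverses once associativity holds.
    full = set(elems)
    if any(set(row) != full for row in t):
        return False
    if any({t[i][j] for i in elems} != full for j in elems):
        return False
    # Associativity: (ab)c = a(bc) for all triples.
    return all(t[t[a][b]][c] == t[a][t[b][c]]
               for a in elems for b in elems for c in elems)


z3_add = [[(i + j) % 3 for j in range(3)] for i in range(3)]
z3_sub = [[(i - j) % 3 for j in range(3)] for i in range(3)]
print(is_group_table(z3_add))   # True
print(is_group_table(z3_sub))   # False: subtraction has no two-sided identity
```

The associativity test examines every triple, so its cost grows with the cube of the number of elements; that is acceptable for small tables such as those in EXAMPLES.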

The procedures in CLASSLIB print error messages if they detect
errors. In addition, if the system decides that something is wrong, the system
will print an error message. One frequent cause of difficulty is the absence
from the active workspace of a necessary procedure because it was erased
or not copied in order to save space. It is a good practice to have all
procedures with a given prefix in the active workspace if you expect to use any
one of them. The procedures DERR, EXPAND, and EXPANDV should also be
present. If procedures with the prefix ZN, ZNX, or Q are being used, then the
procedures with the prefix Z should be in the active workspace too.


2. EXAMPLES 


The workspace EXAMPLES contains copies of the arrays BIGPRIMES 
and BIGINV that are normally in CLASSLIB. In addition, there are arrays 
representing various algebraic objects. The first part of the name indicates 
the type of object according to the following scheme. 


B         Binary operation.
DESIGN    Block design.
GP        Set of permutations.
E         Equivalence relation.
G         Group table.
H         Homomorphism.
INV       List of group inverses.
PLUS      Addition table of a finite ring.
TIMES     Multiplication table of a finite ring.
MPLUS     Addition table for a module.
MTIMES    Module action for a finite ring.


BIBLIOGRAPHY 


Other introductory algebra texts 


Garrett Birkhoff and Saunders MacLane, A Survey of Modern Algebra, 
4th ed. New York: Macmillan, 1977. 


I. N. Herstein, Topics in Algebra, 2nd ed. New York: Wiley, 1975. 


More advanced algebra texts 


Nathan Jacobson, Basic Algebra One and Basic Algebra Two. San Francisco: 
W. H. Freeman, 1974, 1980. 


Serge Lang, Algebra. Reading, Mass.: Addison-Wesley, 1965. 


B. L. Van der Waerden, Algebra, 2 vols, 7th ed. New York: Frederick Ungar, 
1970. 


APL reference 


Leonard Gilman and Allen G. Rose, APL: An Interactive Approach, 2nd 
rev. ed. New York: Wiley, 1976. 


In a class by itself 


Donald E. Knuth, The Art of Computer Programming, Vols. 1 and 2, 2nd 
ed. Reading, Mass.: Addison-Wesley, 1974 and 1981. 


Articles referred to 
Elwyn R. Berlekamp, "Factoring Polynomials over Large Finite Fields."
Math. of Comp., 24 (1970), 713-735.


John Brillhart, "Note on Representing a Prime as a Sum of Two Squares."
Math. of Comp., 26 (1972), 1011-1013.


John D. Dixon, "The Probability of Generating the Symmetric Group."
Math. Zeit., 110 (1969), 199-205.


Daniel Gorenstein, "The Classification of Finite Simple Groups." Bull.
Amer. Math. Soc. (New Series), 1 (1979), 43-199.

Theodore S. Motzkin, "The Euclidean Algorithm." Bull. Amer. Math. Soc.,
55 (1949), 1142-1146.

Charles C. Sims, "Group-Theoretic Algorithms, a Survey." Proc.
International Congress of Mathematicians, Helsinki (1978), 979-985.


INDEX 


Abel, Niels, 79
Abelian group, see Groups, abelian
Adjacency matrix, 37, 39
Adjoint matrix, 199
Algebraic element, 349-353
Algebraic extension, 352
Algebras:
commutative, 364
homomorphism of, 269
properties of, 268-280
quaternion, 276, 277, 280
quotient, 269-273
Algorithm, 1
Alternating group, 121
Annihilator, 310
APL:
arrays:
of characters, 413, 414
description of, 410-415
indices for, 410, 411, 414
names of, 409
rank of, 411
shape of, 412
storage of, 470, 471
definition mode, 424
error messages, 466, 477
execution mode, 424
index origin, 413
inner products, 445-447
keyboard, 406
language, 2, 405-455
outer products, 447
primitive operations:
arithmetic, 415-419
description of, 407, 408
dyadic, 408
logical, 419, 420
mixed, 431-442, 448-455
relational, 419
Right-to-Left Rule for, 407, 408
scalar, 415-423
procedures:
branching in, 427, 428
debugging, 467-470
description of, 423-431
editing, 456-459
global variables, 425, 426
headers, 424, 425
local variables, 425, 426
loops, 427-430
recursive, 428, 429
statement labels, 429, 430
proposition, 7
reduction, 442-444
scan, 444
system commands, 462-466
systems, 2, 405, 456-473
system variables, 413, 459-462
type font, 3
workspaces, 462-466
Arithmetic:
modular, 66, 67
multiple-precision, 62, 63, 66, 67, 76, 91, 160
Ascending chain condition:
for ideals, 214, 216
for submodules, 249
Associate, 217-221
Associativity, see Binary operation
Augmented matrix, 315-318, 323
Automorphism:
of a graph, 125, 126, 132
of a group, 134, 138, 139, 141
of a ring, 159




Base, 63, 64, 66 
Basis, see Modules, free 
Berlekamp’s algorithms, 364—372 
BIGPRIMES, 191 
Bijection, 23 
Binary operation: 
associative, 20, 27, 49, 68-71, 74, 75, 77, 
79, 81 
commutative, 20, 27, 49, 68-70, 74, 79 
computing powers of, 28, 62, 71, 74—77, 
80 
properties of, 19, 20, 27, 28, 49, 51, 68- 
77, 83 
table, 69, 70, 75-77, 83 
Binary power algorithm, 74—77, 88 
Binomial Theorem, 152, 358, 364 
Block: 
of a block design, 33 
of a partition, 10, 16, 19 
Block design, 33, 38 
B1, B2, B3, 75 
Boole, George, 153 
Boolean ring, 153 


Cancellation laws, 80, 93 
Canonical form: 
Jordan, 402—404 
rational, 380—395 
Cardinality, 7 
Cartesian product, 9, 19, 109 
Cayley, Arthur, 116 
Cayley-Hamilton Theorem, 388 
Cayley’s Theorem, 116 
CCONT, 164, 166 
CDIFF, 163, 165 
Center, 134, 138 
Centralizer, 90, 103, 124, 135, 137-139 
Characteristic, 157, 206 
Characteristic vector, 20, 21, 26—32, 42, 90, 
122, 131 


Chinese Remainder Theorem, 58, 60, 66, 107 


CINV, 163, 165 

CLASSLIB, 2, 3, 474-478 
CMAG, 164, 166 

CNORM, 164, 166, 213 
Codomain, 18, 22, 23 

Cofactor, 197—202, 205 
Column operating, see Matrix 
Commutative diagram, 101, 210 


Companion matrix, 385, 386, 390, 393, 394 
Complement, 8 
Complex number, 2, 6, 146, 147, 150, 151, 
163, 164, 166 
Composition: 
of functions, 22, 25, 70, 72 
of homomorphism, 94 
of relations, 13, 14 
Congruence: 
class, 51, 52, 69, 72, 73, 83, 84, 88, 91, 
93 
of integers, 50—54 
modulo a subgroup, 87-91, 98, 99, 133, 
137 
simultaneous, 57—60 
solution of, 52—58, 73 
Conjugacy, 133-143 
Conjugate: 
complex numbers, 164 
in a group, 133 
Content, 230, 234 
Contragredient, 310 
Coordinate vector, 254, 264 
Corner entries, 282—284, 317 
Correspondence, one-to-one, 23 
Coset, 87-89, 91, 98-100, 102, 121, 124 
CPOWER, 163, 165 
CPROD, 163-165, 213 
CQUOT, 163, 165 
Cramer, Gabriel, 322 
Cramer’s Rule, 322 
CSUM, 163, 165, 213 
Current abstract group, 78 
Cycle, 111-115, 135 


DAQ, 162, 178, 179 

DARV, 163-165, 172, 174 

DAZV, 163, 165, 173, 179, 333, 361, 383,
384, 389-391 

Dedekind, Richard, 2 

Degree, see Fields, extension; Polynomials 

De Morgan’s Laws, 26 

Derivative, 355—357 

Descartes, René, 9 

Descending chain condition, 216 

DESIGN1, 33, 40 

DESIGN2, 38 

DESIGN3, 38 

DESIGN4, 38, 40 


DESIGN5, 38, 40
Determinants, 181—191, 195—203, 256 
Diagonal: 
of a matrix, 14 
of XxX, 9, 110 
Dihedral group, 123 
Dimension, 254 
Direct product, 104-110. See also Direct sum 
Direct sum: 
of groups, 110 
of modules, 247, 248, 259 
of rings, 150, 152 
Distributive Law, 146, 153 
Division property, 210, 211 
Division ring, 150, 157 
Divisor, 42, 43, 50, 54, 216. See also 
Elementary divisor; Greatest common 
divisor 
Domain, see Euclidean domain; Function; 
Integral domain; Principal ideal 
domain; Unique factorization domain 
Dual basis, 308, 310 
Dual space, 308-310 


Eigenspace, 396, 399 
Eigenvalue, 395—404 
Eigenvector, 395—404 
Eisenstein, Ferdinand, 234 
Eisenstein’s Criterion, 233—234 
Elementary divisor, 338, 339 
Endomorphism, see Groups; Modules 
EPSILON, 161, 164, 173 
Equivalence, see Matrix 
Equivalence class, 16, 50, 51, 87 
Equivalence relation, 14—18, 23, 26, 32, 50, 
51, 87, 88, 95, 133, 217 
Euclidean algorithm, 45, 46, 48, 49, 66, 88, 
218, 228 

Euclidean domains: 

modules over, 281—347, 373 

properties of, 210-216, 218, 221, 229 
Euclidean norm, 211—216 
Euler, Leonhard, 79 
Euler phi-function, 79, 82, 88, 108, 110 
Evaluation: 

of linear functionals, 309 

of polynomials, 171, 173, 174 
EXAMPLES, 2, 3, 18, 33, 38, 68, 72, 75,
79, 93, 97, 114, 149, 242, 474, 478,
479
E25, 18


Factorial power, 260 
Factorization: 
of Gaussian integers, 225-227 
in integral domains, 216-234
of integers, 54—60 
of polynomials, 224, 225, 237-239, 364— 
372 
Family, see Sets 
Fermat, Pierre de, 226 
Fermat’s Theorem, 227 
Fields: 
extension: 
algebraic elements in, 349-353 
degree of, 348-353 
finite, 348—350 
minimal polynomials of elements in, 
350-354 
properties of, 348—357 
transcendental elements in, 349, 353 
finite (Galois), 158, 357—364 
of fractions, 206—210, 229, 232 
multiplicative groups of, 352, 353 
of rational functions, 209, 239 
subfield, 154 
theory of, 1, 150, 151, 206-210, 348-372 
F1, F2, F3, 97 
FRDET, 185, 191, 206 
FRDIFF, 165 
FRINIT, 164-166, 173, 360, 362 
FRINV, 164—166, 173, 179 
FRMATPROD, 179 
FRNEG, 164-166, 173, 179 
FRPLUS, 164, 166, 173, 179 
FRPOWER, 165 
FRPROD, 165, 362 
FRSUM, 165 
FRTIMES, 164, 166, 173, 179
FRXEVAL, 173, 174
FRXPROD, 173
FRXSUM, 173 
Functions: 
bijective, 23 
characteristic, 20 
codomain (range) of, 18, 22, 23 
composition of, 22, 25, 70, 72 



domain of, 18, 19, 21, 25 

graph of, 22 

identity, 19, 23, 71, 93 

image under, 18, 19, 97 

injective, 23, 25, 75, 100 

inverse image under, 23, 27, 93, 97 

notation for, 18 

one-to-one, 23, 27 

polynomial, 171, 175 

restriction of, 22 

surjective, 22, 23, 25, 75 

well-defined, 51 
Fundamental Theorem of Algebra, 225 
Fundamental Theorem of Arithmetic, 54, 218 
Fundamental Theorem of Finitely Generated 

Abelian Groups, 108, 338 


Galois, Evariste, 1 
Galois field, 359 
Gauss, Carl Friedrich, 154 
GAUSSFACTOR, 227-229 
Gaussian integers, 154, 159, 213, 216, 221, 
225-229, 334 
Gauss’ Lemma, 230 
GAUSSQUOT, 213 
GAUSSREM, 213, 218, 228
General linear group, 177 
GPALLORB, 122, 125, 132 
GPCYCIN, 113, 120 
GPCYCOUT, 113, 122 
GP8, 114-117, 120 
GPORBIT, 122
GPSGN, 119, 120, 184 
GPSGP, 117, 121, 124 
GPSYMG, 114, 184 
Graph, 35-39, 127-133, 140 
Greatest common divisor, 43—50, 52, 54, 57, 
58, 66, 73, 88, 90, 103, 109, 217, 
218, 227, 228 
Groups: 
abelian (commutative), 79, 81, 82, 84, 90, 
91, 95, 96, 103, 108, 109, 144, 241, 
244, 331 
additive notation for, 81—83, 92 
alternating, 121
automorphisms of, 134, 138, 139, 141 
cancellation laws in, 80, 93 
center of, 134, 138 


conjugacy in, 133-143 
cyclic, 84, 85, 87, 89, 90, 93, 101, 102 
dihedral, 123 
direct product of, 104-110 
direct sum of, 105, 110 
endomorphism of, 153 
finite, 79, 85, 87, 88, 90, 92 
finitely generated, 84 
general linear, 177 
homomorphism of: 
kernel, 99-101, 104, 106, 121, 136 
properties of, 92—97, 99-104, 106, 116, 
118, 120, 124, 136, 154 
trivial, 93 
isomorphic, 94 
isomorphism of, 94—96 
isomorphism theorems for, 100, 103, 134, 
138 
multiplicative notation for, 80, 81, 93 
nonabelian, 79 
order of, 79 
order of an element in, 84, 90, 91, 96, 103, 
104, 115, 138 
of permutations: 
isomorphism of, 126 
orbit of, 121-128, 132, 140 
permutation character of, 126, 140 
properties of, 116-127, 141 
stabilizer in, 83, 121, 124, 141 
p-group, 138, 139 
of prime order, 87 
quotient, 100, 103, 104, 154 
regular representations of, 116, 117, 124, 
139 
ring of, 153, 279 
simple, 136-138, 140 
subgroup: 
additive, of integers, 43, 44, 47, 48, 83 
congruence modulo a, 87-89, 91, 98, 99, 
133, 137 
cosets of, 87—89, 91, 98-100, 102, 121, 
124 
generators of, 84, 86, 90, 92, 117, 124, 
125 
index of, 87 
nontrivial, 83 
normal, 97—104, 106, 136, 139, 154, 246 
properties of, 83-93, 97-104, 109, 118, 
143, 154 


p-subgroup, 142 
Sylow, 142-145 
symmetric, 79, 81, 83, 110, 114, 125, 183, 
188 
table, 77-79, 83, 85, 90, 93, 95, 96, 104, 
110, 116, 117, 126, 141 
theory of, 1, 5, 68, 77-145 
G6, 68, 69, 71, 72, 75-78, 82, 85, 90, 93, 
96, 153 
G60, 79, 83, 90, 97, 104, 137, 140, 141, 145 
GTABLE, 77, 78 
GTCHECK, 78, 79 
GTINIT, 78, 79, 86, 89, 92, 97, 137 
GTINV, 77, 78 
GTIO, 77, 78
GTLCON, 89, 92, 97, 137 
GTPROD, 78, 79 
GTRCON, 89, 92, 97, 104, 137 
GTSGP, 86, 89, 92, 97, 137 
G24, 79, 83, 86, 89, 90, 92, 93, 96, 97, 104, 
134, 140 


Hilbert, David, 2 

Holomorph, 139 

Homomorphism, see Algebras; Groups; 
Modules; Rings 

Horner’s method, 53 

H24TO6, 93, 97


Ideal, see Rings 
Identity element, 27, 71, 72, 78, 80, 81, 83, 
93 
Image, 18, 19, 97 
Index, 87 
Induction, mathematical, 2, 41 
Injection, 23 
Integers: 
additive subgroup of: 
definition, 43, 44, 47, 48, 83 
nonnegative generator of, 44, 45, 50 
composite, 54 
congruence of, 50-54
divisibility of, 41-43 
division of: 
integral quotient, 41, 46 
remainder, 41 
factorization of, 54-60
partition of, 135
prime, 54-60, 87, 88, 91, 92, 226, 227
relatively prime, 47, 52, 57, 59, 66, 73, 
76, 107, 108 
theory of, 2, 6, 41-67, 146
see also Gaussian integers 
Integral domain, 151, 152, 157, 169, 207-210,
216, 218
Integral quotient, 41, 46 
Interpolation: 
definition, 234 
factorization using, 237-239
Lagrange formula for, 235, 239, 315 
as solution of linear system, 313 
Inverse, 72-81. See also Matrix; Rings, units 
in 
Inverse image, 23, 27, 93, 97 
Inversion, 119, 125 
INV24, 79, 89, 92, 93 
INV6, 72, 75, 93 
INV60, 79 
Irreducible element, 218—225, 228 
Isomorphism: 
of graphs, 36, 39 
of groups, 94-96
of modules, 247, 323-326, 331-334
of permutation groups, 126 
of rings, 155 
Isomorphism theorems: 
for groups, 100, 103, 134, 138 
for modules, 247, 308 
for rings, 157 


Jordan, Camille, 400 
Jordan block, 400-403


Kernel, see Groups, homomorphisms of; 
Rings, homomorphisms of; Modules, 
homomorphisms of 


Lagrange, Joseph-Louis, 87 

Lagrange’s Theorem, 87, 123, 141 

Laws of exponents, 71, 74, 80, 92 

Least common multiple, 48, 49, 57, 90, 91, 
109 

Linear combination, 245, 248-250

Linear dependence, 249, 254, 258 

Linear functional, 308 

Linear independence, 249-252, 258, 259, 312 


Linear transformations: 
minimal polynomials of, 377, 378, 385-388,
394, 397, 398
properties of, 246, 307, 373-404 
similarity of, 373-395
see also Modules, homomorphism of 
Lucas-Lehmer test, 61, 63, 66, 67 


M, 59 
Map, see Function 
Matrix: 
characteristic, 13, 18, 26, 27, 32, 89, 97, 
133 
column operations on, 196, 197, 205 
companion, 385, 386, 390, 393, 394 
definition, 19, 20 
diagonal, 397 
diagonal of, 14 
diagonalizable, 397, 398 
elementary, 194-196, 204-206, 286, 295,
325 
elementary divisors of, 338, 339 
equivalence of, 324-326 
of a homomorphism, 264-268, 375
inverse of, 203, 204, 298, 301 
lower triangular, 187 
minimal polynomial of, 377, 378, 385-388, 
394, 397, 398 
permutation, 195, 196, 205 
product, 175-181, 187, 340, 343 
rank of, 339 
reduced, 327-336, 342
ring, 148, 149, 175-181, 242
row-echelon form for, 282-297
row equivalence of, 281-302, 315
row operations on, 191-197, 204-206,
285-290
row reduced:
over a field, 292-294, 317
over the integers, 294, 295
over a polynomial ring, 295
row reduction of, 288-290
similarity of, 373-395
trace of, 393
transition, 254-257, 260, 310
transpose of, 178, 335 
upper triangular, 186 
see also Canonical form 


Mersenne, Marin, 60 
Mersenne primes, 60, 66 
Minor, 201, 202 
Möbius, August Ferdinand, 363
Möbius function, 363
Modules: 
cyclic, 245, 331 
cyclic decomposition of, 331, 336-343 
direct sum of, 247, 248, 259 
endomorphism of, 261 
over Euclidean domains, 281-347, 373
finitely generated, 245, 257, 323-336
finite presentations of, 324
free:
basis of, 250, 251, 256
change of basis in, 254-256, 265-268
properties of, 249-260
rank of, 254
standard basis of, 250, 256, 257
standard invariants of, 282-284, 290,
291
submodules of, 281-284
homomorphisms of:
kernel of, 246, 311, 321
matrix of, 264-268, 375
properties of, 246-248, 260-268
isomorphism of, 247, 323-326, 331-334
isomorphism theorems for, 247, 308
quotient, 246
simple, 267
submodules, 244-249, 252, 257, 258, 281-284,
312, 322
theory of, 1, 241-280 
trivial, 242 
see also Vector spaces 
Modulus, 50 
MPLUS, 242, 243
MPZDET, 185, 191 
MPZDIFF, 62, 63 
MPZMAG, 63 
MPZPOWER, 62, 63, 76 
MPZPROD, 62, 63 
MPZREM, 62, 63, 66 
MPZSGN, 63 
MPZSUM, 62 
Monoid, 72-77
Multiple, 42, 48-50. See also Least common 
multiple 
Multiple precision. See Arithmetic 


N, 76, 173, 179, 185, 190, 191, 203, 218, 
222-224, 270, 271, 305, 306, 316, 
318, 320, 333, 360, 365, 370, 371, 
372, 377, 383, 388, 389, 396, 399, 
402 

Natural map, 19, 101, 104, 156, 157 

Natural numbers, 6 

Normalizer, 103, 104, 137, 143 


Orbit, 121-128, 132, 140 
Order, see Groups 


Partition, see Integers, Sets 
Permutation character, 126, 140 
Permutation group, see Groups of 
permutations 

Permutations: 

cycle of, 111-115, 135 

cyclic, 111, 136 

even, 121, 125, 136 

inversion in, 119, 125 

odd, 121, 125, 136 


of a set, 23, 25-28, 70, 76, 79, 80, 82, 83,
110-116, 131
sign of, 118-121, 125, 183-189 
p-group, 138, 139 
PLUS, 149, 150, 152, 155, 159, 165, 173,
242, 243 
Polynomials: 
characteristic, 383, 385-388, 393-396
constant, 170
content of, 230, 234
degree of, 169, 170, 172, 211
derivative of, 355-357
division of, 211, 212, 215
evaluation of, 171, 173, 174
factorization of, 224, 225, 237-239, 364-372
function, 171, 175
irreducible, 221-225, 230-234, 270, 271,
356, 359, 362
leading coefficient of, 169, 170, 172
minimal, see Fields, extension; Linear
Transformations; Matrix
monic, 215-217
primitive, 230-232, 234, 363 


ring of, 148, 166-175, 179, 211, 214, 221-225,
229-240


roots of: 
adjunction of, 349-352 
multiple, 352-357 
multiplicity of, 352-357 
properties of, 215, 216, 349-357 
splitting fields of, 353-360 
see also Interpolation 
Prime element, 218-221, 228, 269. See also
Integers, prime 
Principal ideal, 212, 214 
Principal ideal domain (PID), 213, 214, 217, 
219, 220, 229, 258 
Projection, 19, 106 
Projective plane, 35 
Pseudoprime, 88, 91 
p-subgroup, 142 
P20, 122 


Q, 62, 122 

QDIFF, 162, 165

QINV, 162, 165 

QMATPROD, 178

QNEG, 162, 165 

QPOWER, 162, 165 

QPROD, 162, 165 

QQUOT, 162, 165 

QSUM, 162, 165 

Quaternion algebras, 276, 277, 280 

Quotient, see Algebras; Groups; Modules; 
Rings; Vector spaces 


Radix, 63 
Range, see Codomain 
Rank, see Matrix; Modules, free 
Rational functions, 209, 239 
Rational numbers, 2, 6, 96, 146, 161, 162, 
207 
RDET, 185, 190, 191, 240 
Real numbers, 2, 6, 94, 96, 146, 161 
Refinement, 10, 32, 53 
Reflexivity, 14-17, 42
Regular representations, 116, 117, 124, 139 
Relations: 
composition of, 13, 14 
empty, 12 
identity, 12, 14 
implication of, 12 
inverse, 12-14, 23
properties of, 12-18
reflexive, 14-17, 42
symmetric, 14-17
transitive, 14-17
trivial, 12, 14 

Restriction, 22 

Rings: 
characteristic of, 157, 206 


commutative, 147, 158, 159, 253, 256, 257 


direct sum of, 150, 152 
division rings, 150, 157 
embeddings of, 207, 209, 210 
of endomorphisms, 261-263
finite, 147, 149, 164, 165 


homomorphisms of, 155-159, 175, 180, 190 


ideals in, 155-159, 213, 214, 242, 245, 
246, 257 

isomorphism of, 155 

isomorphism theorems for, 157 

group, 153, 279 

of matrices, 148, 149, 175-181, 242 

multiplicative identity in, 146 

opposite, 149, 152, 244, 248 


of polynomials, 148, 166-175, 179, 211,
214, 221-225, 229-240
prime elements in, 218-221, 228, 269
quotient, 156, 157
subring, 154-158
theory of, 1, 2, 146-240 
trivial, 148
units in, 150, 152, 159, 189, 191-206, 
212, 229, 253, 255, 256, 281, 297 
RLSYS, 346
RMOD, 242, 243
Root, see Polynomials 
Row equivalence, see Matrix 
Row operation, see Matrix 
Row reduction, 288-290
Russell, Bertrand, 10 
RXDEGREE, 172, 173
RXDET, 185, 240
RXDIFF, 172, 173
RXEVAL, 173, 174, 236, 237, 240
RXFACTOR, 225
RXGCD, 356
RXINTERP, 236-240
RXLEAD, 172, 173
RXPROD, 172, 173
RXQUOT, 212, 238 
RXREM, 212, 218, 238 


RXSUM, 172, 173


R, 46, 52, 58, 73, 298-300, 332, 342, 345, 390


SCHV, 21, 29, 30, 34, 37, 39, 130 
Semidirect product, 140 


Semigroup, 70-75, 81, 82. See also Binary operation

SEQREL, 15, 89, 134

Sets: 
complements, 8 
difference of, 8, 20 
disjoint, 8 
elements (members) of, 5 
empty, 6, 7, 25 
family of, 24, 26, 31, 33 
finite, 7, 24, 25 
infinite, 24 
intersection of, 8, 20, 25 
operations on, 8, 9, 20, 28-32 
paradox concerning, 10 


partition of, 9-12, 15-17, 19, 23, 32, 53 


subset, 7, 20, 25, 31 

theory of, 2, 4-12 

union of, 8, 20, 24 
SFEL, 32, 92, 104 
Sieve, 55, 56, 59, 221-223
Sign, see Permutations
Similarity, 373-395
Span, 245 
Splitting field, 353-360 
SSORT, 6, 11, 60, 86, 92, 93, 104, 121 
SSUB, 7, 8, 21, 34, 37-39, 129 
Stabilizer, 83, 121, 124, 141 
Structure constants, 276-280, 322 
Subfield, see Fields 
Subgroup, see Groups 
Submodule, see Modules 
Subring, see Rings 
Subset, see Sets 
Subspace, see Vector spaces 
Surjection, 23 
Sylow, Ludwig, 142 
Sylow p-subgroup, 142-145 
Sylow theorems, 141-145, 158 


Symmetric group, 79, 81, 83, 110, 114, 125,
183, 188
Symmetry, 14-17 


Systems of linear equations:


consistent, 311, 314, 344 


homogeneous, 311-314, 319-322, 346 
inconsistent, 311, 314, 344 
matrix of coefficients, 311 
solution of, 311-323, 343-347
S, 46, 52, 58, 73, 271, 332, 342, 345, 390 


TIMES, 149, 150, 152, 155, 159, 165, 173,
242, 243 

Torsion element, 336 

Trace, 393 

Transcendental element, 349, 353 

Transitivity, 14-17, 42

Transpose, 178, 335 

Transposition, 123, 124, 136, 188 

TRAV, 179, 361 


Unique factorization domain (UFD), 219-221, 
228, 229, 232, 239 
Unit, see Rings 


Vandermonde, Alexandre T., 313 
Vandermonde matrix, 313, 314, 320
Vector spaces: 

dimension of, 254 

dual basis, 308, 310 

dual space, 308-310 

finite dimensional, 245 

properties of, 244, 252, 303-310, 373 

quotient space, 246 

subspace, 244, 307, 309 

see also Modules, Linear transformations 


Wedderburn, J. H. M., 151 

Wilson, John, 144 

Wilson’s Theorem, 144, 227 
W, 345, 346, 367, 396, 399 


X, 122


ZAINIT, 278 

ZAPROD, 278, 279 

ZASUM, 279 

ZCHREM, 59, 60 

ZDET, 185-187, 190, 191, 253, 258, 266,
341 

Zero-divisor, 151 

ZFACTOR, 57, 59-61, 63 

ZGCD, 45-48, 52, 58, 60, 73, 115 

ZLCM, 48, 60 


ZLSYS, 345, 346 

ZMATINV, 203, 204, 206, 258, 266, 298, 
300, 301, 342 

ZNDET, 185, 191, 367, 396, 400

ZNDIFF, 160, 161, 165

ZNINV, 160, 165, 218

ZNLSYS, 366, 367, 396, 399

ZNMATINV, 203, 204, 206, 377, 393, 400 

ZNMATPROD, 179, 180, 377, 389, 391-393, 
399, 400, 402 

ZNNEG, 160, 165 

ZNPOWER, 76, 91, 160, 161, 165 

ZNPROD, 160, 161, 165

ZNROWREDUCE, 301, 305, 306, 316, 318, 
320 

ZNSUM, 160, 161, 165 

ZNXDEGREE, 173

ZNXDET, 185, 190, 191, 389, 399, 402

ZNXDIFF, 173, 271, 367, 370, 371

ZNXEVAL, 173, 174, 191

ZNXFACTOR, 224, 270, 371, 372

ZNXFDIFF, 273

ZNXFINIT, 272, 273, 279, 360, 365, 367,
370 

ZNXFINV, 273 

ZNXFPOWER, 273, 360, 366, 367, 370, 371 

ZNXFPROD, 273, 279, 360-362 

ZNXFSUM, 273, 361 

ZNXGCD, 271, 272, 366, 367, 370, 371 

ZNXIRRED, 223, 228, 270, 360 

ZNXLEAD, 173 

ZNXMATINV, 390, 391

ZNXMATPROD, 179

ZNXMONIC, 223, 228

ZNXPROD, 173, 212, 218, 271, 279

ZNXQUOT, 212, 216, 366, 367, 370, 371

ZNXREDUCE, 333, 334, 383, 384, 389, 390

ZNXREM, 212, 216, 218, 222, 224, 271, 272,
279

ZNXROWREDUCE, 301

ZNXRT, 272, 273

ZNXSUM, 173, 212, 271

ZPRIMES, 56

ZQUOT, 41, 42, 46, 63-65, 211 

ZREDUCE, 332, 334, 342, 345

ZREM, 41, 42, 63, 211, 218

ZROWREDUCE, 296, 298-300

ZSC, 278

ZXINTERP, 236

ZXMATPROD, 180


ISBN 0-471-09846-9 


